Abstract

The universality for the eigenvalue spacing statistics of generalized Wigner matrices was established 
in our previous work [19] under certain conditions on the probability distributions of the matrix elements. 
A major class of probability measures excluded in [19] are the Bernoulli measures. In this paper, we 
extend the universality result of [19] to include the Bernoulli measures so that the only restrictions on 
the probability distributions of the matrix elements are the subexponential decay and the normalization 
condition that the variances in each row sum up to one. The new ingredient is a strong local semicircle law 
which improves the error estimate on the Stieltjes transform of the empirical measure of the eigenvalues 
from the order $(N\eta)^{-1/2}$ to $(N\eta)^{-1}$. Here $\eta$ is the imaginary part of the spectral parameter in the
definition of the Stieltjes transform and $N$ is the size of the matrix.
1 Introduction



The universality of local eigenvalue statistics in the bulk of the spectrum of random matrices has been
traditionally considered only for invariant ensembles [4, 7, 8, 25]. For non-invariant ensembles, a new approach
to prove the bulk universality was developed in [16, 14, 18, 19]. It consists of the following three steps: 

1. Local semicircle law. 

2. Universality for Gaussian divisible ensembles. 

3. Approximation by Gaussian divisible ensembles. 

In Step 2, the universality of the local eigenvalue statistics for a large class of matrices, i.e., Gaussian 
divisible matrices, was established. Thus in order to prove the universality of a given ensemble, it remains to 
approximate the matrix elements in this ensemble by Gaussian divisible distribution in such a way that the 
local eigenvalue statistics are unchanged. This approximation is intrinsically a density theorem and it can 
be achieved by perturbative expansions in several different ways. In the most recent approach [18, 19], the 
universality for Gaussian divisible ensembles was proved via the Dyson Brownian motion and the stability 
of eigenvalues in Step 3 was provided by the Green function comparison theorem. In Step 2 a technical tool, 
the logarithmic Sobolev inequality (LSI), was needed to estimate the fluctuations of eigenvalue distribution. 
This restriction could not be completely removed in Step 3 and thus the Bernoulli measures were excluded 
in [19]. In this paper, we will improve the local semicircle law so that the LSI is no longer needed. This 
will enable us to prove the universality for generalized Wigner matrices with Bernoulli distributions. As a 
byproduct of the new stronger form of local semicircle law, we also obtain much stronger estimates on the 
eigenvalue density and on the matrix elements of the resolvent. 

Recall that the Stieltjes transform of the empirical measure of the eigenvalues $\lambda_1, \ldots, \lambda_N$ is defined by

$$m_N(z) := \frac{1}{N} \sum_{j=1}^{N} \frac{1}{\lambda_j - z}.$$

We have proved in [19] that the difference between $m_N(z)$ and $m_{sc}(z)$, the Stieltjes transform of the semicircle
law (2.9), is bounded by $(N\eta)^{-1/2}$ where $\eta = \Im z$. The main result of this paper states that the error can
be improved to $(N\eta)^{-1}$. The improvement of a factor $(N\eta)^{-1/2}$ resembles the usual $N^{-1/2}$ factor in the
central limit theorem and it results from a new estimate on the correlations of error terms. This estimate
also implies that the error between the normalized empirical counting function of the eigenvalues and the
one given by the semicircle law is less than $N^{-1+\varepsilon}$ in the bulk of the spectrum for any $\varepsilon > 0$. This new input
is sufficiently strong to replace the usage of the (LSI) in [19], see the discussion after Theorem 2.2 for more 
details. 

Notice that this improvement of a factor $(N\eta)^{-1/2}$ and the removal of the LSI need a substantial amount
of work. We took on this endeavor for the following two reasons: (1) The distributions of
the Bernoulli random matrices are very singular while the Gaussian measures in GOE are very smooth. It is
not a priori clear that the universality holds for such singular distributions. (2) The adjacency matrices for
random graphs are natural examples of symmetric random matrices. The matrix elements of these matrices
take the values 0 or 1 and thus they form Bernoulli random matrices. Our current results do not cover this
case since we require the mean zero condition, but they represent the first step toward the universality of 
the adjacency matrices of random graphs. 
2 Main results



We now state the main results of this paper. Since all our results hold for both hermitian and symmetric
ensembles, we will state the results for the hermitian case only. The modifications to the symmetric case
are straightforward and they will be omitted. Let $H = (h_{ij})_{i,j=1}^N$ be an $N \times N$ hermitian matrix where
the matrix elements $h_{ij} = \overline{h}_{ji}$, $i \le j$, are independent random variables given by a probability measure $\nu_{ij}$
with mean zero and variance $\sigma_{ij}^2$. The variance of $h_{ij}$ for $i > j$ is $\sigma_{ij}^2 = \mathbb{E}|h_{ij}|^2 = \sigma_{ji}^2$. For simplicity of the
presentation, we assume that for any fixed $1 \le i < j \le N$, $\Re h_{ij}$ and $\Im h_{ij}$ are i.i.d. with distribution $\omega_{ij}$,
i.e., $\nu_{ij} = \omega_{ij} \otimes \omega_{ij}$ in the sense that $\nu_{ij}(dh) = \omega_{ij}(d\,\Re h)\,\omega_{ij}(d\,\Im h)$, but this assumption is not essential for
the result. The distribution $\nu_{ij}$ and its variance $\sigma_{ij}^2$ may depend on $N$, but we omit this fact in the notation.
We assume that for any fixed $j$

$$\sum_{i} \sigma_{ij}^2 = 1. \qquad (2.1)$$

Matrices with independent, zero mean entries and with the normalization condition (2.1) will be called
universal Wigner matrices. The basic parameter of such matrices is the quantity

$$M := \frac{1}{\max_{ij} \sigma_{ij}^2}. \qquad (2.2)$$



Define $C_{inf}$ and $C_{sup}$ by

$$C_{inf} := \inf_{N,i,j} \{N\sigma_{ij}^2\} \le \sup_{N,i,j} \{N\sigma_{ij}^2\} =: C_{sup}. \qquad (2.3)$$

Note that $C_{inf} = C_{sup}\ (= 1)$ corresponds to the standard Wigner matrices and the conditions $0 < C_{inf} \le
C_{sup} < \infty$ define more general Wigner matrices with comparable variances.
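
The quantities in (2.1)-(2.3) are easy to evaluate for a concrete variance profile. The following minimal Python sketch does this for a small circulant profile chosen purely for illustration; the weights, the size $N = 6$ and the helper names `circulant_profile` and `profile_parameters` are assumptions of this example and do not come from the paper. Note that (2.3) takes the infimum and supremum over $N$ as well, while the sketch only evaluates them at one fixed $N$.

```python
from fractions import Fraction


def circulant_profile(weights):
    """Variance matrix s[i][j] = weights[(i - j) mod N] (illustrative choice)."""
    n = len(weights)
    return [[weights[(i - j) % n] for j in range(n)] for i in range(n)]


def profile_parameters(s):
    """Return row sums, M from (2.2), and N*min, N*max as in (2.3) at fixed N."""
    n = len(s)
    entries = [x for row in s for x in row]
    row_sums = [sum(row) for row in s]
    m = 1 / max(entries)
    c_inf = n * min(entries)
    c_sup = n * max(entries)
    return row_sums, m, c_inf, c_sup


# Symmetric weights (w[k] == w[N - k]) that sum to one, so the profile is
# symmetric and doubly stochastic, i.e. it satisfies (2.1).
weights = [Fraction(1, 4), Fraction(1, 6), Fraction(1, 6),
           Fraction(1, 12), Fraction(1, 6), Fraction(1, 6)]
S = circulant_profile(weights)
row_sums, M, c_inf, c_sup = profile_parameters(S)

print("row sums:", row_sums)
print("M =", M, " N*min =", c_inf, " N*max =", c_sup)
```

Exact rational arithmetic is used so that the normalization (2.1) can be checked without rounding error.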

We will also consider an even more general case when $\sigma_{ij}$ for different $(i,j)$ indices are not comparable.
A special case is the band matrix, where $\sigma_{ij} = 0$ for $|i - j| > W$ with some parameter $W$.

Denote by $\Sigma := \{\sigma_{ij}^2\}_{i,j=1}^N$ the matrix of variances which is symmetric, doubly stochastic by (2.1), and
in particular satisfies $-1 \le \Sigma \le 1$. Let the spectrum of $\Sigma$ be supported in

$$\mathrm{Spec}(\Sigma) \subset [-1 + \delta_-, 1 - \delta_+] \cup \{1\} \qquad (2.4)$$

with some nonnegative constants $\delta_\pm$. We will always have the following spectral assumption

1 is a simple eigenvalue of $\Sigma$ and $\delta_-$ is a positive constant, independent of $N$. (2.5)

The local semicircle law will be proven under this general condition, but the precision of the estimate near 
the spectral edge will also depend on $\delta_+$ in an explicit way. For the orientation of the reader, we mention
two special cases that provided the main motivation for our work. 

One important class of universal Wigner matrices is the generalized Wigner ensemble which is defined by 
the extra condition that 

$$0 < C_{inf} \le C_{sup} < \infty. \qquad (2.6)$$

It is easy to check that (2.4) holds with

$$\delta_\pm \ge C_{inf}. \qquad (2.7)$$

Another example is the band matrix ensemble whose variances are given by

$$\sigma_{ij}^2 = W^{-1} f\Big(\frac{[i - j]_N}{W}\Big), \qquad (2.8)$$

where $W \ge 1$, $f : \mathbb{R} \to \mathbb{R}_+$ is a nonnegative symmetric function with $\int f = 1$, $f \in L^\infty(\mathbb{R})$, and we defined
$[i - j]_N \in \{1, 2, \ldots, N\}$ by the property that $[i - j]_N \equiv i - j \mod N$. The bandwidth $M$ defined in (2.2)
satisfies $M \le W/\|f\|_\infty$. In Appendix A of [19], we have proved that (2.5) is satisfied for the choice of (2.8)
if $W$ is large enough.



Define the Stieltjes transform of the empirical eigenvalue distribution of $H$ by

$$m(z) = m_N(z) := \frac{1}{N} \mathrm{Tr}\, \frac{1}{H - z}, \qquad z = E + i\eta.$$

Define $m_{sc}(z)$ as the unique solution of

$$m_{sc}(z) + \frac{1}{z + m_{sc}(z)} = 0$$

with positive imaginary part for all $z$ with $\Im z > 0$, i.e.,

$$m_{sc}(z) = \frac{-z + \sqrt{z^2 - 4}}{2}. \qquad (2.9)$$

Here the square root function is chosen with a branch cut in the segment $[-2, 2]$ so that asymptotically
$\sqrt{z^2 - 4} \sim z$ at infinity. This guarantees that the imaginary part of $m_{sc}$ is non-negative for $\Im z > 0$ and in the
limit $\eta \to 0$ it gives the Wigner semicircle distribution

$$\varrho_{sc}(E) := \lim_{\eta \to 0+} \frac{1}{\pi} \Im\, m_{sc}(E + i\eta) = \frac{1}{2\pi} \sqrt{(4 - E^2)_+}. \qquad (2.10)$$

The Wigner semicircle law [32] states that $m_N(z) \to m_{sc}(z)$ for any fixed $z$, i.e., provided that $\eta$ is independent
of $N$. We have proved [19] a local version of this result for universal Wigner matrices and the main
result can be stated as the following probability estimate:

P (\mN{z) - m,,(z)| > (logiV)^^^=^) < CN-'^^'°siosN) 

with some constant $C_2$. The accuracy of this estimate can be improved from $(M\eta)^{-1/2}$ to $(M\eta)^{-1}$,
which is the content of the next theorem. It summarizes the results of Theorems 4.1 and 5.1. Prior to our 
result in [19], a central limit theorem for the semicircle law on macroscopic scale for band matrices was 
established by Guionnet [21] and Anderson and Zeitouni [2]; a semicircle law for Gaussian band matrices 
was proved by Disertori, Pinson and Spencer [9]. For a review on band matrices, see the recent article [27] 
by Spencer. 
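
As a numerical illustration of (2.9) and (2.10), the following minimal Python sketch evaluates $m_{sc}(z)$ by selecting the root of $m^2 + zm + 1 = 0$ with positive imaginary part, which is equivalent to the branch choice described above. The function names `m_sc` and `semicircle_density` are illustrative only, and approximating the limit $\eta \to 0+$ by a small fixed $\eta$ is an assumption of this sketch.

```python
import cmath
import math


def m_sc(z):
    """Stieltjes transform of the semicircle law for Im z > 0, cf. (2.9)."""
    root = cmath.sqrt(z * z - 4)
    m = (-z + root) / 2
    # The two roots of m^2 + z m + 1 = 0 have product 1, so exactly one of
    # them lies in the upper half plane; keep that one.
    if m.imag < 0:
        m = (-z - root) / 2
    return m


def semicircle_density(energy, eta=1e-8):
    """Approximate (2.10) by (1/pi) Im m_sc(E + i eta) at a small fixed eta."""
    return m_sc(complex(energy, eta)).imag / math.pi


z = complex(0.5, 0.01)
residual = abs(m_sc(z) + 1 / (z + m_sc(z)))   # defining equation above (2.9)
print("residual of the self-consistent equation:", residual)
print("density at E = 0:", semicircle_density(0.0), " vs 1/pi =", 1 / math.pi)
print("density at E = 1:", semicircle_density(1.0),
      " vs sqrt(3)/(2 pi) =", math.sqrt(3) / (2 * math.pi))
```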

Theorem 2.1 (Local semicircle law) Let $H$ be a hermitian $N \times N$ random matrix with $\mathbb{E} h_{ij} = 0$, $1 \le
i, j \le N$, and assume that the variances $\sigma_{ij}^2$ satisfy (2.1) and (2.5). Suppose that the distributions of the
matrix elements have a uniformly subexponential decay in the sense that there exist constants $\alpha, \beta > 0$,
independent of $N$, such that for any $x > 0$ we have

$$\mathbb{P}(|h_{ij}| \ge x^\alpha |\sigma_{ij}|) \le \beta e^{-x}. \qquad (2.11)$$

We consider universal Wigner matrices and its special class, the generalized Wigner matrices, in parallel.
The parameter $A$ will distinguish between the two cases; we set $A = 2$ for universal Wigner matrices, and
$A = 1$ for generalized Wigner matrices, where the results will be stronger.
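Define the following domain in $\mathbb{C}$

$$D := \Big\{ z = E + i\eta \in \mathbb{C} : |E| \le 5,\ 0 < \eta \le 10,\ \sqrt{M\eta} \ge (\log N)^{C_1} (\kappa + \eta)^{\frac{1}{4} - A} \Big\} \qquad (2.12)$$

where $\kappa := |\,|E| - 2\,|$. Then there exist constants $C_1$, $C_2$, $C$ and $c > 0$, depending only on $\alpha$, $\beta$ and $\delta_-$ in
(2.5), such that for any $\varepsilon > 0$ and $K > 0$ the Stieltjes transform of the empirical eigenvalue distribution of
$H$ satisfies

$$\mathbb{P}\Big( \bigcup_{z \in D} \Big\{ |m(z) - m_{sc}(z)| \ge \frac{N^\varepsilon}{M\eta\, (\kappa + \eta)^A} \Big\} \Big) \le \frac{C(\varepsilon, K)}{N^K} \qquad (2.13)$$

for sufficiently large $N$. Furthermore, the diagonal matrix elements of the Green function $G_{ii}(z) = (H -
z)^{-1}(i,i)$ satisfy that

$$\mathbb{P}\Big( \bigcup_{z \in D} \Big\{ \max_i |G_{ii}(z) - m_{sc}(z)| \ge \frac{(\log N)^{C_2}}{\sqrt{M\eta}} (\kappa + \eta)^{\frac{1}{4} - \frac{A}{2}} \Big\} \Big) \le C N^{-c(\log\log N)} \qquad (2.14)$$

and for the off-diagonal elements we have

$$\mathbb{P}\Big( \bigcup_{z \in D} \Big\{ \max_{i \ne j} |G_{ij}(z)| \ge \frac{(\log N)^{C_2}}{\sqrt{M\eta}} (\kappa + \eta)^{\frac{1}{4}} \Big\} \Big) \le C N^{-c(\log\log N)} \qquad (2.15)$$

for any sufficiently large $N$.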

The subexponential decay condition (2.11) can also be easily weakened if we are not aiming at error
estimates faster than any power law of $N$. This can be easily carried out and we will not pursue it in this
paper.

Denote the eigenvalues of $H$ by $\lambda_1, \ldots, \lambda_N$ and let $p_N(\lambda_1, \ldots, \lambda_N)$ be their (symmetric) probability
density. For any $k = 1, 2, \ldots, N$ the $k$-point correlation function of the eigenvalues is defined by

$$p_N^{(k)}(x_1, x_2, \ldots, x_k) := \int_{\mathbb{R}^{N-k}} p_N(x_1, x_2, \ldots, x_N)\, dx_{k+1} \ldots dx_N. \qquad (2.16)$$

We now state our main result concerning these correlation functions. The same result was proved in [19] 
under the additional assumption (2.26). 

Theorem 2.2 (Universality for generalized Wigner matrices) Consider a generalized hermitian
Wigner ensemble such that (2.1), (2.5) and (2.6) hold. Suppose that the distributions $\nu_{ij}$ of the matrix
elements have a uniformly subexponential decay in the sense of (2.11). Suppose that the real and imaginary
parts of $h_{ij}$ are i.i.d., distributed according to $\omega_{ij}$, i.e., $\nu_{ij}(dh) = \omega_{ij}(d\,\Im h)\,\omega_{ij}(d\,\Re h)$. Then for any $k \ge 1$
and for any compactly supported continuous test function $O : \mathbb{R}^k \to \mathbb{R}$ we have

$$\lim_{b \to 0} \lim_{N \to \infty} \frac{1}{2b} \int_{E-b}^{E+b} dE' \int_{\mathbb{R}^k} d\alpha_1 \ldots d\alpha_k\, O(\alpha_1, \ldots, \alpha_k)\, \frac{1}{\varrho_{sc}(E)^k} \Big( p_N^{(k)} - p_{GUE,N}^{(k)} \Big) \Big( E' + \frac{\alpha_1}{N\varrho_{sc}(E)}, \ldots, E' + \frac{\alpha_k}{N\varrho_{sc}(E)} \Big) = 0, \qquad (2.17)$$

where $p_{GUE,N}^{(k)}$ is the $k$-point correlation function for the GUE ensemble. The same statement holds for
symmetric matrices, with GOE replacing the GUE ensemble.

Remark. We can take $b = N^{-c}$ for some small constant $c > 0$ so that there is no double limit taken. This
is because all our bounds have an effective error estimate $N^{-c}$. In case of hermitian matrices there is no
need for averaging in the energy parameter $E'$. The limit (2.17) holds even for any fixed energy $E'$, with
$|E'| < 2$, since, instead of relying on the local relaxation flow of [14, 18], we can use the result of [16] for
Gaussian divisible ensembles at a fixed energy.

It is well-known that the limiting correlation functions of the GUE ensemble are given by the sine kernel

$$\frac{1}{\varrho_{sc}(E)^k}\, p_{GUE,N}^{(k)}\Big( E + \frac{\alpha_1}{N\varrho_{sc}(E)}, \ldots, E + \frac{\alpha_k}{N\varrho_{sc}(E)} \Big) \to \det\{K(\alpha_i - \alpha_j)\}_{i,j=1}^k, \qquad K(x) = \frac{\sin \pi x}{\pi x},$$

and a similar universal formula is available for the limiting gap distribution. The formulas for the GOE cases 
are more complicated and we refer the reader to standard references such as [1, 6, 20, 24]. 

We will prove Theorem 2.2 using the approach of [18, 19]. The logarithmic Sobolev inequality was an
important tool in these papers and it was the main reason why Bernoulli random matrices were
not covered. We note that the Bernoulli distribution satisfies the discrete version of the LSI but it would
not be sufficient for our purposes. To explain the necessity of LSI, we now review the three basic ingredients
of the approach of [18, 19].

Step 1. Local semicircle law: It states that the density of eigenvalues is given by the semicircle law down to
short scales containing only $N^\varepsilon$ eigenvalues for all $\varepsilon > 0$, where $N$ is the size of the matrix.

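Step 2. Local ergodicity of the Dyson Brownian motion: The Dyson Brownian motion is given by the flow

$$H_t = e^{-t/2} H_0 + (1 - e^{-t})^{1/2}\, V, \qquad (2.18)$$

where $H_0$ is the initial Wigner matrix, $V$ is an independent standard GUE (or GOE) matrix and $t \ge 0$
is the time. Here we have used the version that the dynamics of the matrix element is given by an
Ornstein-Uhlenbeck (OU) process on $\mathbb{C}$. More precisely, let

$$\mu = \mu_N(d\mathbf{x}) := \frac{e^{-\mathcal{H}(\mathbf{x})}}{Z_\beta}\, d\mathbf{x}, \qquad \mathcal{H}(\mathbf{x}) = \mathcal{H}_N(\mathbf{x}) := N \Big[ \beta \sum_{i=1}^N \frac{x_i^2}{4} - \frac{\beta}{N} \sum_{i<j} \log |x_j - x_i| \Big] \qquad (2.19)$$

be the probability measure of the eigenvalues $\mathbf{x} = (x_1, x_2, \ldots, x_N)$ of the general $\beta$ ensemble, $\beta \ge 1$
($\beta = 2$ for the hermitian case and $\beta = 1$ for the symmetric case). Denote the distribution of the
eigenvalues of $H_t$ at time $t$ by $f_t(\mathbf{x})\mu(d\mathbf{x})$. Then $f_t = f_{t,N}$ satisfies [10]

$$\partial_t f_t = L f_t, \qquad (2.20)$$

where

$$L = L_N := \sum_{i=1}^N \frac{1}{2N} \partial_i^2 + \sum_{i=1}^N \Big( -\frac{\beta}{4} x_i + \frac{\beta}{2N} \sum_{j \ne i} \frac{1}{x_i - x_j} \Big) \partial_i, \qquad \partial_i = \frac{\partial}{\partial x_i}. \qquad (2.21)$$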

We now recall the following theorem concerning the universality of the Dyson Brownian motion.
Following the convention in [18], we label the assumptions as Assumptions II-IV since Assumption I,
a convexity property of the Hamiltonian for the invariant measure of the Dyson Brownian motion, is
automatically satisfied for any $\beta$ ensemble.

Assumption II. For any fixed $a, b \in \mathbb{R}$, we have

$$\lim_{N \to \infty} \sup_{t \ge 0} \Big| \int \frac{1}{N} \sum_{j=1}^N \mathbf{1}(x_j \in [a, b])\, f_t(\mathbf{x})\, d\mu(\mathbf{x}) - \int_a^b \varrho_{sc}(x)\, dx \Big| = 0, \qquad (2.22)$$

where $\varrho_{sc}$ is the density of the semicircle law (2.10).

Let $\gamma_j = \gamma_{j,N}$ denote the location of the $j$-th point under the semicircle law, i.e., $\gamma_j$ is defined by

$$N \int_{-\infty}^{\gamma_j} \varrho_{sc}(x)\, dx = j, \qquad 1 \le j \le N. \qquad (2.23)$$

We will call $\gamma_j$ the classical location of the $j$-th point.
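Assumption III. There exists an $\varepsilon > 0$ such that

$$\sup_{t \ge N^{-2\varepsilon}} \int \frac{1}{N} \sum_{j=1}^N (x_j - \gamma_j)^2\, f_t(d\mathbf{x})\, \mu(d\mathbf{x}) \le C N^{-1-2\varepsilon} \qquad (2.24)$$

with a constant $C$ uniformly in $N$.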

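The final assumption is an upper bound on the local density. For any $I \subset \mathbb{R}$, let

$$\mathcal{N}_I := \sum_{i=1}^N \mathbf{1}(x_i \in I)$$

denote the number of eigenvalues in $I$.

Assumption IV. For any compact subinterval $I_0 \subset (-2, 2) = \{E : \varrho_{sc}(E) > 0\}$, and for any $\delta > 0$,
$\sigma > 0$ there are constants $C_n$, $n \in \mathbb{N}$, depending on $I_0$, $\delta$ and $\sigma$ such that for any interval $I \subset I_0$ with
$|I| \ge N^{-1+\sigma}$ and for any $K \ge 1$, we have

$$\sup_{\tau \ge N^{-2\varepsilon+\delta}} \int \mathbf{1}\{\mathcal{N}_I \ge K N |I|\}\, f_\tau\, d\mu \le C_n K^{-n}, \qquad n = 1, 2, \ldots, \qquad (2.25)$$

where $\varepsilon$ is the exponent from Assumption III and $\sigma$ and $\delta$ are arbitrarily small numbers.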

We have proved [19] that Assumption IV follows from the local semicircle law and Assumption III also 
follows from the local semicircle law provided that a uniform LSI for the distributions of the matrix 
elements is assumed. 
Step 3. Green function comparison theorem: It asserts that the correlation functions of the eigenvalues of two
matrix ensembles are identical up to the scale $1/N$ provided that the first four moments of the matrix
elements of these two ensembles are almost identical. Given this theorem and the universality for the
Dyson Brownian motion for t ^ , the universality for a matrix ensemble H holds if we can find 
another matrix ensemble $H_0$ such that the first four moments of the matrix elements of $H$ and $H_t$
(given by (2.18)) are almost the same. Furthermore, $H_0$ is required to satisfy a uniform LSI so that
Assumption III can be verified. This is possible if the first four moments of $H_0$ satisfy



where $m_k(i,j)$ is the $k$-th moment of the $(i,j)$ matrix element in the symmetric case. In the hermitian
case, the moments of the real and imaginary parts have to satisfy (2.26).
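
The classical locations $\gamma_j$ defined in (2.23) can be computed numerically. The sketch below is a minimal illustration, not part of the paper's argument: it uses the closed-form distribution function of the semicircle density (2.10) and plain bisection. The function names `semicircle_cdf` and `classical_locations` and the tolerance are assumptions of this example.

```python
import math


def semicircle_cdf(x):
    """Integral of the semicircle density (2.10) from -infinity to x."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return (0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi)
            + math.asin(x / 2.0) / math.pi)


def classical_locations(n, tol=1e-12):
    """Solve N * cdf(gamma_j) = j for j = 1..N by bisection on [-2, 2]."""
    locations = []
    for j in range(1, n + 1):
        target = j / n
        lo, hi = -2.0, 2.0
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if semicircle_cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        locations.append((lo + hi) / 2.0)
    return locations


gammas = classical_locations(10)
for j, g in enumerate(gammas, start=1):
    print(j, round(g, 6))
```

With this convention the last point $\gamma_N$ sits at the upper spectral edge $2$, and by symmetry of the density $\gamma_j = -\gamma_{N-j}$ for $1 \le j < N$.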

Combining these ingredients, the universality of local eigenvalue statistics in the bulk was proved for 
all generalized Wigner ensembles (see (2.6) for the definition) satisfying (2.26) and a subexponential decay 
technical condition. The restriction (2.26) was needed to guarantee the existence of a matching matrix 
ensemble whose matrix element distributions satisfy the LSI so that the Assumption III can be verified. The 
local semicircle estimates in Theorem 2.1 imply that the empirical counting function of the eigenvalues is 
close to the semicircle counting function (Theorem 6.3) and that the locations of the eigenvalues are close to
their classical locations in the mean square deviation sense (Theorem 7.1). This provides a direct proof of
Assumption III (2.24) and thus removes the usage of the LSI.

Finally we summarize the recent results related to the bulk universality of local eigenvalue statistics. The 
local semicircle law for Step 1 was first established for Wigner matrices in a series of papers [11, 12, 13]. 
The method was based on a self-consistent equation for the Stieltjes transform of the eigenvalues and the 
continuity of the imaginary part of the spectral parameter in the Stieltjes transform. As a by-product, an 
eigenvector delocalization estimate was proved. 

The universality for Gaussian divisible ensembles was proved by Johansson [23] for hermitian Wigner 
ensembles. It was extended to complex sample covariance matrices by Ben Arous and Péché [3]. There were
two major restrictions of this method: 1. The Gaussian component was fairly large, it was required to be
of order one independent of $N$. 2. It relies on explicit formulas for the correlation functions of eigenvalues
which are valid only for Gaussian divisible ensembles with unitary invariant Gaussian component. The size
of the Gaussian component was reduced to N~^^'^ in [16] by using an improved formula for correlation 
functions and the local semicircle law from [11, 12, 13]. The Gaussian component was then removed by a 
perturbation argument using the reverse heat flow. Thus the three step strategy to prove the universality was 
introduced and it led to the first proof of the bulk universality for hermitian Wigner ensembles. Due to the 
reverse heat flow argument used in Step 3, the universality class established in [16] was restricted to matrices 
with smooth distributions for the matrix elements. Shortly after, Tao and Vu [28] proved the four moment 
theorem which in particular removes the smoothness restriction in Step 3. It thus proved the universality for 
hermitian Wigner matrices whose matrix element distributions were supported on at least three points. The 
last condition was removed in [17] by combining the arguments of [16, 28]. The result of [28] also implies 
that the local statistics of symmetric Wigner matrices and GOE are the same, but under the restriction
that the first four moments of the matrix elements match those of GOE. Thus the universality class for 
the local correlation functions established via the approach of combining [28] and [23] was broader for the 
hermitian ensembles than for the symmetric ones. This difference arose because Johansson's result [23],
which provided the universality for Gaussian divisible ensembles in Step 2, was available only for hermitian
ensembles.




(2.26) 
A more general and conceptually very appealing approach for Step 2 is via the local ergodicity of Dyson
Brownian motion. This approach, initiated in [14], was applied to prove the universality for symmetric
Wigner matrices with the three point support condition. In [18], we formulated a general theorem for
the bulk universality which applies to all classical ensembles, i.e., real and complex Wigner matrices, real
and complex sample covariance matrices and quaternion Wigner matrices. Later on, Tao and Vu [29] also
extended their results to the sample covariance matrices with the three point support condition for complex
covariance matrices and four moment matching conditions for real ones. Shortly after [29], Péché [26] also
extended the approach of [16] to the complex sample covariance matrices and proved the universality in the
bulk.

Most recently, we introduced [19] the Green function comparison theorem and extended the local
semicircle law to include the matrix elements of the Green functions. This allows us to remove the smoothness
restriction from the reverse heat flow argument in Step 3 of our approach. We remark that the comparison 
theorems in [28] concern individual eigenvalues with a fixed index, while the Green function comparison 
theorem is at a fixed energy. On the other hand, in [19] the variances of the matrix elements were allowed 
to vary, i.e., the matrices belonged to generalized Wigner ensembles. The three step strategy can thus be 
applied and the universality was proved for generalized Wigner ensembles with essentially only one class 
of measures, the Bernoulli measures, excluded due to the LSI used in verifying Assumption III in Step 2. 
Finally, in the current paper, Assumption III will be shown to be a consequence of a strong local semicircle
law, which will be proved for all ensembles with a subexponential decay property. In particular, Bernoulli 
measures are now included in the universality class (in the sense of (2.17)) for both hermitian and symmetric 
generalized Wigner ensembles. We have thus removed all restrictions except the subexponential decay in 
our approach. A clear picture of the three step strategy emerges: Steps 2 and 3 hold under very general
conditions and are model independent. The main task of proving the universality is to establish a strong 
version of the local semicircle law — which can be model dependent. We believe that our method applies to 
generalized sample covariance matrices as well, but we will not pursue this direction in this paper. 



3 Proof of Universality 

We now prove the main universality theorem, Theorem 2.2.

Step 1. Universality for Dyson Brownian Motion: Under the Assumptions II-IV in the introduction, the 
universality for the Dyson Brownian Motion was proved in [18]. We recall the statement in the following 
Theorem. 

Theorem 3.1 [Theorem 2.1 of [18]] Let $\varepsilon > 0$ be the exponent from Assumption III. Suppose that the
Assumptions II, III and IV hold for the solution $f_t$ of the forward equation (2.20) for all time $t \ge N^{-2\varepsilon}$.
Let $E \in \mathbb{R}$ be a point where $\varrho(E) > 0$. Then for any $k \ge 1$ and for any compactly supported continuous test
function $O : \mathbb{R}^k \to \mathbb{R}$, we have



Y j-E+b ^ 

lim lim sup — / dE' / dai . . . da^ 0(ai, . . . , c^k) 



ctk 



(3.1) 



0. 



g(£;)n^*'^ ^P.A'yV ' NQ{Ey'' Ng{E)J 

Notice that the assumption on the initial entropy is not needed as was remarked in [19]. 

Step 2 Universality for Gaussian divisible ensembles: The Dyson Brownian motion is generated by the
matrix flow (2.18). Our task is to determine the initial ensemble $H_0$ so that the Assumptions II-IV of
Theorem 3.1 can be proved for the flow. Assumption IV is a direct consequence of the local semicircle
law, i.e., Theorem 4.1. Assumption III will be proved in Proposition 7.1. For the generalized Wigner
matrices, the only assumption of Theorem 4.1 and Proposition 7.1 is the subexponential decay property of
the distributions of the matrix elements. Since the evolution of the matrix element is given by an
Ornstein-Uhlenbeck process, the subexponential property is preserved and we only have to check it for the initial data.
We have thus proved the following theorem.

Theorem 3.2 Suppose that the probability law for the initial matrix $H_0$ satisfies the assumptions of Theorem
2.2. Then there exists $\varepsilon_0 > 0$ such that for any $t \ge N^{-\varepsilon_0}$, the probability law for the eigenvalues of $H_t$ satisfies
the universality equation (2.17).



Step 3 Green function comparison theorem: We have proved the universality for all ensembles with the
matrix element at $(i,j)$ distributed by $\sigma_{ij}\xi_{ij}^t$ with

$$\xi_{ij}^t = e^{-t/2}\, \xi_{ij}^0 + (1 - e^{-t})^{1/2}\, \xi_{ij}^G, \qquad (3.2)$$

where $\xi_{ij}^G$ are independent Gaussian random variables with mean 0 and variance 1 and $t \sim N^{-\varepsilon_0}$. In order
to prove Theorem 2.2, it remains to approximate all random variables with the subexponential property by
$\xi_{ij}^t$. The only requirement of $\xi_{ij}^0$ is the subexponential decay property and the mean zero and variance one
normalization. Our tool is the following Green function comparison theorem from [19]. It implies that the
correlation functions of the eigenvalues of two matrix ensembles at a fixed energy are identical up to the scale
$1/N$ provided that the first four moments of the matrix elements of these two ensembles are almost identical.
Prior to this theorem, it was proved in [28] that the joint distribution of individual eigenvalues for Wigner
ensembles is the same under the four moment assumption. Tao-Vu's theorem addresses the distribution
of individual eigenvalues¹ while Theorem 3.3 compares Green functions (and thus eigenvalues) at a fixed
energy.

Theorem 3.3 Suppose that we have two generalized $N \times N$ Wigner matrices, $H^{(v)}$ and $H^{(w)}$, with matrix
elements $h_{ij}$ given by the random variables $N^{-1/2} v_{ij}$ and $N^{-1/2} w_{ij}$, respectively, with $v_{ij}$ and $w_{ij}$ satisfying
the uniform subexponential decay condition (2.11). Fix a bijective ordering map on the index set of the
independent matrix elements,

$$\phi : \{(i,j) : 1 \le i \le j \le N\} \to \{1, \ldots, \gamma(N)\},$$

and denote by $H_\gamma$ the generalized Wigner matrix whose matrix elements $h_{ij}$ follow the $v$-distribution if
$\phi(i,j) > \gamma$ and they follow the $w$-distribution otherwise; in particular $H^{(v)} = H_0$ and $H^{(w)} = H_{\gamma(N)}$. Let
$\kappa > 0$ be arbitrary and suppose that for any small parameter $\tau > 0$ and for any $y \ge N^{-1+\tau}$ we have the
following estimate on the diagonal elements of the resolvent:

$$\mathbb{P}\Big( \max_{0 \le \gamma \le \gamma(N)}\ \max_{1 \le k \le N}\ \max_{|E| \le 2 - \kappa} \Big| \Big( \frac{1}{H_\gamma - E - iy} \Big)_{kk} \Big| \le N^{2\tau} \Big) \ge 1 - C N^{-c \log\log N} \qquad (3.3)$$

¹In a recent preprint [31] (appeared after the current preprint was first posted), it was pointed out that if the four moment
condition is violated, then the differences between individual eigenvalues of the two ensembles are bigger than the eigenvalue 
spacing. Thus the four moment condition is also necessary for locating the individual eigenvalues. This is in contrast with the 
main theme of this paper that gap distribution and correlation functions are even independent of the second moments as long 
as they are nonzero. 
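with some constants $C, c$ depending only on $\tau, \kappa$. Moreover, we assume that the first three moments of $v_{ij}$
and $w_{ij}$ are the same, i.e.

$$\mathbb{E}\, \bar{v}_{ij}^{\,s} v_{ij}^{u} = \mathbb{E}\, \bar{w}_{ij}^{\,s} w_{ij}^{u}, \qquad 0 \le s + u \le 3,$$

and the difference between the fourth moments of $v_{ij}$ and $w_{ij}$ is much less than 1, say

$$\Big| \mathbb{E}\, \bar{v}_{ij}^{\,s} v_{ij}^{4-s} - \mathbb{E}\, \bar{w}_{ij}^{\,s} w_{ij}^{4-s} \Big| \le N^{-\delta}, \qquad s = 0, 1, 2, 3, 4, \qquad (3.4)$$

for some given $\delta > 0$. Let $\varepsilon > 0$ be arbitrary and choose an $\eta$ with $N^{-1-\varepsilon} \le \eta \le N^{-1}$. For any sequence
of positive integers $k_1, \ldots, k_n$, set complex parameters $z_j^m = E_j^m \pm i\eta$, $j = 1, \ldots, k_m$, $m = 1, \ldots, n$ with
$|E_j^m| \le 2 - 2\kappa$ and with an arbitrary choice of the $\pm$ signs. Let $G^{(v)}(z) = (H^{(v)} - z)^{-1}$ be the resolvent and
let $F(x_1, \ldots, x_n)$ be a function such that for any multi-index $\alpha = (\alpha_1, \ldots, \alpha_n)$ with $1 \le |\alpha| \le 5$ and for any
$\varepsilon' > 0$ sufficiently small, we have

$$\max\Big\{ |\partial^\alpha F(x_1, \ldots, x_n)| : \max_j |x_j| \le N^{\varepsilon'} \Big\} \le N^{C_0 \varepsilon'} \qquad (3.5)$$

and

$$\max\Big\{ |\partial^\alpha F(x_1, \ldots, x_n)| : \max_j |x_j| \le N^2 \Big\} \le N^{C_0} \qquad (3.6)$$

for some constant $C_0$.

Then, there is a constant $C_1$, depending on $\alpha$, $\beta$, $k_i$ and $C_0$ such that for any $\eta$ with $N^{-1-\varepsilon} \le \eta \le N^{-1}$
and for any choices of the signs in the imaginary part of $z_j^m$

$$\Big| \mathbb{E} F\Big( \frac{1}{N^{k_1}} \mathrm{Tr} \Big[ \prod_{j=1}^{k_1} G^{(v)}(z_j^1) \Big], \ldots, \frac{1}{N^{k_n}} \mathrm{Tr} \Big[ \prod_{j=1}^{k_n} G^{(v)}(z_j^n) \Big] \Big) - \mathbb{E} F\big( G^{(v)} \to G^{(w)} \big) \Big| \le C_1 N^{-1/2 + C_1 \varepsilon} + C_1 N^{-\delta + C_1 \varepsilon}, \qquad (3.7)$$

where in the second term the arguments of $F$ are changed from the Green functions of $H^{(v)}$ to $H^{(w)}$ and all
other parameters remain unchanged.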

Given this theorem, for any matrix ensemble $H$ whose matrix elements at $(i,j)$ are distributed according
to $\sigma_{ij}\xi_{ij}$, we need to find $\xi_{ij}^0$ such that the first four moments of $\xi_{ij}^t$ and $\xi_{ij}$ are almost the same and $\xi_{ij}^0$
has a subexponential decay. Since the real and imaginary parts are i.i.d., it is sufficient to match them
individually. This is the content of the following lemma which is stated for real random variables normalized
to variance one. With this lemma, we have proved Theorem 2.2. This lemma is essentially the same as
Lemma 28 in [28].

Lemma 3.4 Let $m_3$ and $m_4$ be two real numbers such that

$$m_4 - m_3^2 - 1 \ge 0, \qquad m_4 \le C_2$$

for some positive constant $C_2$. Let $\xi^G$ be a real Gaussian random variable with mean 0 and variance 1. Then
for any sufficiently small $\gamma > 0$ (depending on $C_2$), there exists a real random variable $\xi_\gamma$ with subexponential
decay and independent of $\xi^G$, such that the first four moments of

$$\xi' = (1 - \gamma)^{1/2}\, \xi_\gamma + \gamma^{1/2}\, \xi^G$$

are $m_1(\xi') = 0$, $m_2(\xi') = 1$, $m_3(\xi') = m_3$ and $m_4(\xi')$, and

$$|m_4(\xi') - m_4| \le C\gamma \qquad (3.8)$$

for some positive constant $C$ depending on $C_2$.

Proof. It is easy to see by an explicit construction that the following holds:

For any given numbers $m_3$, $m_4$, with $m_4 - m_3^2 - 1 \ge 0$ there is a random
variable $X$ with first four moments 0, 1, $m_3$, $m_4$ and with subexponential decay. (3.9)

For any real random variable $\xi$, independent of $\xi^G$, and with the first 4 moments being 0, 1, $m_3(\xi)$ and
$m_4(\xi) < \infty$, the first 4 moments of

$$\xi' = (1 - \gamma)^{1/2}\, \xi + \gamma^{1/2}\, \xi^G$$

are 0, 1,

$$m_3(\xi') = (1 - \gamma)^{3/2}\, m_3(\xi) \qquad (3.10)$$

and

$$m_4(\xi') = (1 - \gamma)^2\, m_4(\xi) + 6\gamma - 3\gamma^2. \qquad (3.11)$$

Using (3.9), we obtain that for any $\gamma > 0$ there exists a real random variable $\xi_\gamma$ such that the first four
moments are 0, 1,

$$m_3(\xi_\gamma) = (1 - \gamma)^{-3/2}\, m_3$$

and

$$m_4(\xi_\gamma) = m_3(\xi_\gamma)^2 + (m_4 - m_3^2).$$

With $m_4 \le C_2$, we have $m_3^2 \le C_2^{3/2}$, thus

$$|m_4(\xi_\gamma) - m_4| \le C\gamma$$

for some positive constant $C$ depending on $C_2$. Hence with (3.10) and (3.11), we obtain that $\xi' = (1 -
\gamma)^{1/2}\xi_\gamma + \gamma^{1/2}\xi^G$ satisfies $m_3(\xi') = m_3$ and (3.8). This completes the proof of Lemma 3.4. □
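
The moment formulas (3.10) and (3.11) can be checked on a concrete example. The minimal sketch below uses a two-point (Bernoulli type) random variable with mean 0 and variance 1, chosen here only for illustration, and expands the moments of the Gaussian mixture with the binomial formula. The helper names `moments` and `mixed_moment`, the specific two-point law and the value $\gamma = 0.1$ are assumptions of this example. For this two-point law the condition $m_4 - m_3^2 - 1 \ge 0$ of (3.9) holds with equality.

```python
import math
from fractions import Fraction

# Moments E[G^k], k = 0..4, of a standard real Gaussian.
GAUSS_MOMENTS = [1, 0, 1, 0, 3]


def moments(law, kmax=4):
    """Moments m_0..m_kmax of a discrete law given as (value, probability) pairs."""
    return [sum(p * v ** k for v, p in law) for k in range(kmax + 1)]


def mixed_moment(k, xi_moments, gamma):
    """E[((1-gamma)^{1/2} xi + gamma^{1/2} G)^k] for independent xi and G."""
    a = math.sqrt(1.0 - gamma)
    b = math.sqrt(gamma)
    return sum(math.comb(k, i) * a ** i * b ** (k - i)
               * float(xi_moments[i]) * GAUSS_MOMENTS[k - i]
               for i in range(k + 1))


# Two-point law: value -1/2 with probability 4/5, value 2 with probability 1/5.
law = [(Fraction(-1, 2), Fraction(4, 5)), (Fraction(2), Fraction(1, 5))]
m = moments(law)
print("moments of the two-point law:", m)
print("m4 - m3^2 - 1 =", m[4] - m[3] ** 2 - 1)

gamma = 0.1
lhs3 = mixed_moment(3, m, gamma)
rhs3 = (1 - gamma) ** 1.5 * float(m[3])                              # (3.10)
lhs4 = mixed_moment(4, m, gamma)
rhs4 = (1 - gamma) ** 2 * float(m[4]) + 6 * gamma - 3 * gamma ** 2   # (3.11)
print("third moment:", lhs3, "vs formula", rhs3)
print("fourth moment:", lhs4, "vs formula", rhs4)
```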

4 Large Deviation of Local Semicircle Law

We first reprove the large deviation of local semicircle law given in [19]. The result of this section is relevant
only for $\eta \ge M^{-1}$.

Theorem 4.1 Assume the $N \times N$ random matrix $H$ satisfies (2.1), (2.4), (2.5) and (2.11), $\mathbb{E} h_{ij} = 0$, for
any $1 \le i, j \le N$. Let $z = E + i\eta$ ($\eta > 0$) and let $g(z)$ be a non-negative function defined by

^"^^^^ |l- 777,e(^)2| ^max{5+ , |9^e 777^^ (z) - 1 1 } ' ^^'^^ 

Let K = \\E\ — 2|. Then for all z = E + irj with 

\E\<5, -^<77<10, ^/Il^> (log N)^^+'^"'0^{z){K + T]y^^ (4.2) 
■\Gu{z) - msc{z)\ > (logiV)6+2«i^l±^i^! e{z)\ < CiV--(i°gi°s^) 



max|Gij(z)| > {\ogNf+^°' 



(« + v) 



1/4 



(4.3) 



(4.4) 



for sufficiently large $N$ with some positive constants $c$ and $C$ that depend only on $\alpha$ and $\beta$ in (2.11) and $\delta_-$
in (2.4) and (2.5).

The theorem will be proved at the end of the section after collecting several lemmas. The first lemma
describes the behavior of $m_{sc}$ in the various regimes; its proof is elementary calculus. We use the notation
$f \sim g$ for two positive functions in some domain $D$ if there is a positive universal constant $C$ such that
$C^{-1} \le f(z)/g(z) \le C$ holds for all $z \in D$.



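Lemma 4.2 We have for all $z$ with $\mathrm{Im}\, z > 0$ that

$|m_{sc}(z)| = |m_{sc}(z) + z|^{-1} \le 1$.
From now on, let $z = E + i\eta$ with $|E| \le 5$ and $\eta > 0$. If $\eta \ge 10$, then we have

$\mathrm{Im}\, m_{sc}(z) \sim \eta^{-1}$, $|m_{sc}(z)| \sim \eta^{-1}$, $|1 - m_{sc}^2(z)| \sim 1$, $|1 - \mathrm{Re}\, m_{sc}^2(z)| \sim 1$.
If $\eta \le 10$, then we have

$|m_{sc}(z)| \sim 1$, $|1 - m_{sc}^2(z)| \sim \sqrt{\kappa + \eta}$.

For the behavior of $|1 - \mathrm{Re}\, m_{sc}^2(z)|$ and $\mathrm{Im}\, m_{sc}(z)$ we distinguish two cases.
Case 1. For $|E| \ge 2$ we have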



if K>rj 



3mmsc{z) 



^Jn + T] if K < 7] 



|1 - 5He mf,(z) I -V^^. 



Case 2. For $|E| < 2$ we have



3mmsc{z) ^ ^/k + rj, 



\l-mtml^iz)\ 



/K+rj 



if'n<K 



\/k + rj if K < ij 



Thus the control function $\theta(z)$ has the following behavior

1 



e{z) 



if 7] > 10, 

{S^\ y/^/v, K-^} if 77 < 10, < 2 and K > r;, 

(k + 77)-i/2 if ,^ < fo, and {2 < < 10 or k < 77}. 



(4.5) 

(4.6) 
(4.7) 



(4.8) 



(4.9) 
□ 



(4.10) 
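The first two statements of Lemma 4.2 are easy to probe numerically. The sketch below is an illustration under stated assumptions: it takes $m_{sc}(z)$ to be the solution of $m^2 + zm + 1 = 0$ with positive imaginary part (equivalent to the relation $m_{sc} + z = -m_{sc}^{-1}$ used later in this section), and the function names `m_sc` and `edge_ratio` are hypothetical helpers, not notation from the paper.

```python
import cmath
import math


def m_sc(z: complex) -> complex:
    """Root of m^2 + z*m + 1 = 0 with positive imaginary part (needs Im z > 0)."""
    root = cmath.sqrt(z * z - 4.0)
    candidate = (-z + root) / 2.0
    # The two roots multiply to 1, so exactly one lies in the upper half plane.
    return candidate if candidate.imag > 0 else (-z - root) / 2.0


def edge_ratio(energy: float, eta: float) -> float:
    """|1 - m_sc^2| divided by sqrt(kappa + eta), with kappa = | |E| - 2 |."""
    m = m_sc(complex(energy, eta))
    kappa = abs(abs(energy) - 2.0)
    return abs(1.0 - m * m) / math.sqrt(kappa + eta)


if __name__ == "__main__":
    for energy in (0.0, 1.5, 2.0, 2.5):
        for eta in (1e-3, 1e-1, 1.0):
            z = complex(energy, eta)
            m = m_sc(z)
            # Columns: E, eta, |m_sc|, |m_sc| * |m_sc + z|, edge ratio.
            print(energy, eta, abs(m), abs(m) * abs(m + z), edge_ratio(energy, eta))
```

The fourth printed column should equal 1 up to rounding, reflecting $|m_{sc}| = |m_{sc} + z|^{-1}$, and the last column should stay within fixed positive bounds for $\eta \le 10$, in line with $|1 - m_{sc}^2| \sim \sqrt{\kappa + \eta}$.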
Note that the precise formula (4.1) for $\theta(z)$ is not important; only its asymptotic behavior for small $\kappa$, $\eta$
and $\delta_+$ is relevant. The theorem remains valid if $\theta(z)$ is replaced by $\widetilde\theta(z)$ with $\theta(z) \le C\widetilde\theta(z)$. In particular,
$\theta(z)$ can be chosen to be of order one when $E$ is not near the edges of the spectrum. If we are only concerned
with the generalized Wigner ensemble (2.6), then by (2.7) we can choose $\theta(z) = (\kappa+\eta)^{-1/2}$ for any $z = E + i\eta$
($\eta > 0$). For universal Wigner matrices we have $\theta(z) \le C(\kappa+\eta)^{-1}$ for $|z| \le 10$, i.e., using the parameter $A$
introduced in Theorem 2.1, we have

$\theta(z) \le \frac{C}{(\kappa+\eta)^{A/2}}$, $A = 1, 2$, $|E|, \eta \le 10$. (4.11)

Based upon these formulas, we also have, for any $z = E + i\eta$ with $\eta > 0$,

$\mathrm{Im}\, m_{sc}(z) + \frac{1}{\theta(z)} \le C \min\{1, \sqrt{\kappa+\eta}\}$. (4.12)



First, we introduce some notations. Recall that $G_{ij} = G_{ij}(z)$ denotes the matrix element

$G_{ij} = \Big(\frac{1}{H - z}\Big)_{ij}$

and

$m(z) = \frac{1}{N} \sum_{i=1}^{N} G_{ii}(z)$.



Definition 4.1 Let $T = \{k_1, k_2, \ldots, k_t\} \subset \{1, 2, \ldots, N\}$ be an unordered set of $|T| = t$ elements and let
$H^{(T)}$ be the $N - t$ by $N - t$ minor of $H$ after removing the $k_i$-th ($1 \le i \le t$) rows and columns. For $T = \emptyset$,
we have $H^{(\emptyset)} = H$. Similarly, we define $a^{(l;\,T)}$ to be the $l$-th column with the $k_i$-th ($1 \le i \le t$) elements removed.
Sometimes, we just use the short notation $a^l = a^{(l;\,T)}$. For any $T \subset \{1, 2, \ldots, N\}$ we introduce the following
notations:

$G^{(T)}_{ij} := (H^{(T)} - z)^{-1}(i,j)$,

$Z^{(T)}_{ij} := a^i \cdot (H^{(T)} - z)^{-1} a^j = \sum_{k,l \notin T} \overline{a^i_k}\, G^{(T)}_{kl}\, a^j_l$,

$K^{(T)}_{ij} := h_{ij} - z\delta_{ij} - Z^{(T)}_{ij}$.

These quantities depend on $z$, but we mostly neglect this dependence in the notation.

The following two results were proved in our previous work (Lemma 4.2 and Corollary B.3 of [19]) and 
they will be our key inputs. We start with the self-consistent perturbation formulas. 

Lemma 4.3 [Self-consistent Perturbation Formulas] Let $T \subset \{1, 2, \ldots, N\}$. For simplicity, we use the
notation $(i\,T)$ for $(\{i\} \cup T)$ and $(ij\,T)$ for $(\{i,j\} \cup T)$. Then we have the following identities:

1. For any $i \notin T$

$G^{(T)}_{ii} = (K^{(i\,T)}_{ii})^{-1}$. (4.13)

2. For $i \ne j$ and $i, j \notin T$

$G^{(T)}_{ij} = -G^{(T)}_{jj}\, G^{(j\,T)}_{ii}\, K^{(ij\,T)}_{ij} = -G^{(T)}_{ii}\, G^{(i\,T)}_{jj}\, K^{(ij\,T)}_{ij}$. (4.14)

3. For $i \ne j$ and $i, j \notin T$

$G^{(T)}_{ii} - G^{(j\,T)}_{ii} = G^{(T)}_{ij}\, G^{(T)}_{ji}\, (G^{(T)}_{jj})^{-1}$. (4.15)

4. For any indices $i$, $j$ and $k$ that are different and $i, j, k \notin T$

$G^{(T)}_{ij} - G^{(k\,T)}_{ij} = G^{(T)}_{ik}\, G^{(T)}_{kj}\, (G^{(T)}_{kk})^{-1}$. (4.16)
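The identities of Lemma 4.3 are Schur-complement formulas, and they can be sanity-checked on a small matrix. The following Python sketch is an illustration only: it takes $T = \emptyset$, a real symmetric $4 \times 4$ matrix chosen arbitrarily, and the standard reading of (4.13)-(4.16) with $K_{ij} = h_{ij} - z\delta_{ij} - Z_{ij}$. The helper names `inverse`, `resolvent` and `schur_residuals` are hypothetical.

```python
def inverse(a):
    """Gauss-Jordan inverse of a small complex matrix (list of lists)."""
    n = len(a)
    aug = [[complex(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(h, z, removed=()):
    """G^{(T)} for T = removed, returned as a dict keyed by original indices."""
    keep = [i for i in range(len(h)) if i not in removed]
    minor = [[h[i][j] - (z if i == j else 0.0) for j in keep] for i in keep]
    inv = inverse(minor)
    return {(i, j): inv[a][b] for a, i in enumerate(keep) for b, j in enumerate(keep)}


def schur_residuals(h, z, i=0, j=1, k=2):
    """Absolute residuals of (4.13)-(4.16) for T empty and real symmetric h."""
    n = len(h)
    g = resolvent(h, z)
    g_i, g_j, g_k = (resolvent(h, z, (t,)) for t in (i, j, k))
    g_ij = resolvent(h, z, (i, j))
    rest_i = [a for a in range(n) if a != i]
    rest_ij = [a for a in range(n) if a not in (i, j)]
    k_ii = h[i][i] - z - sum(h[i][a] * g_i[a, b] * h[b][i]
                             for a in rest_i for b in rest_i)
    k_ij = h[i][j] - sum(h[i][a] * g_ij[a, b] * h[b][j]
                         for a in rest_ij for b in rest_ij)
    return {
        "4.13": abs(g[i, i] - 1.0 / k_ii),
        "4.14": abs(g[i, j] + g[i, i] * g_i[j, j] * k_ij),
        "4.15": abs(g[i, i] - g_j[i, i] - g[i, j] * g[j, i] / g[j, j]),
        "4.16": abs(g[i, j] - g_k[i, j] - g[i, k] * g[k, j] / g[k, k]),
    }


if __name__ == "__main__":
    H = [[0.3, -0.5, 0.2, 0.1],
         [-0.5, -0.1, 0.4, -0.3],
         [0.2, 0.4, 0.6, 0.5],
         [0.1, -0.3, 0.5, -0.2]]
    print(schur_residuals(H, complex(0.2, 0.5)))
```

All four residuals should be at the level of rounding error. The check uses a spectral parameter with nonzero imaginary part so that every minor of $H - z$ is invertible.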



Lemma 4.4 Let $a_i$ ($1 \le i \le N$) be $N$ independent random complex variables with mean zero, variance $\sigma^2$
and having the uniform subexponential decay (2.11). Let $A_i, B_{ij} \in \mathbb{C}$ ($1 \le i, j \le N$). Then we have that

N 



i=l 



log log AT 



>(log7V)i+v(^|A,p)'^'| 

> (logiV)i+2"a2(^|B,,|2) I <civ-'°s'°s^, 

> (logiV)3+2v2(^|i?,yf)'^' I <cAr-i°gi°g^, 



(4.17) 
(4.18) 
(4.19) 



for some constant $C$ depending on $\alpha$ and $\beta$ in (2.11).



We start with determining a system of self-consistent equations for the diagonal matrix elements of the
resolvent. We can write $G_{ii}$ as follows,



1 



where E^i = denotes the expectation with respect to the elements in the i-th column of the matrix H, 
i.e., w.r.t. a* = {hu, /i2i, . . . , h^i)* . Introduce the notations 



A ■ ^2 ^ , \ " 2 G'ij Gji 



Gi 



(4.20) 



and 



Using the fact that $G^{(i)} = (H^{(i)} - z)^{-1}$ is independent of $a^i$ and $\mathbb{E}_{a^i}\, \overline{a^i_k}\, a^i_l = \delta_{kl}\, \sigma_{ik}^2$, we obtain



and 
Denote by

and we have the identity 



$\Upsilon_i = \Upsilon_i(z) := A_i + \big(K^{(i)}_{ii} - \mathbb{E}_{a^i} K^{(i)}_{ii}\big) = A_i + h_{ii} - Z_i$. (4.21)



Let 



Vi '■— Gii — rrisc, '~ TV ^ ^ ^ '~ TV ^ ^ " TV ^ '"sc/ 



We will estimate the following key quantities

$\Lambda_d := \max_k |v_k| = \max_k |G_{kk} - m_{sc}|$, $\Lambda_o := \max_{k \ne l} |G_{kl}|$, (4.23)

where the subscripts refer to "diagonal" and "offdiagonal" matrix elements. All the quantities defined so far
depend on the spectral parameter $z = E + i\eta$, but we will mostly omit this fact from the notation. The real
part $E$ will always be kept fixed. For the imaginary part we will use a continuity argument at the end of
the proof and then the dependence of $\Lambda_{d,o}$ on $\eta$ will be indicated.

Both quantities $\Lambda_d$ and $\Lambda_o$ will be typically small; eventually we will prove that their size is less than
$(M\eta)^{-1/2}$, modulo logarithmic corrections and a factor involving the distance to the edge. We thus define
the exceptional event

$\Omega_\Lambda = \Omega_\Lambda(z) := \Big\{ \Lambda_d(z) + \Lambda_o(z) \ge \frac{(\log N)^{-3/2}}{\theta(z)} \Big\}$. (4.24)
We will always work in $\Omega_\Lambda^c$, and, in particular, we will have

$\Lambda_d(z) + \Lambda_o(z) \le C(\log N)^{-3/2}$
since $1/\theta(z) \le C$ by (4.12). Define the set

$S := \{ z = E + i\eta \,:\, |E| \le 5,\ N^{-1} < \eta \le 10 \}$.

We thus have

$c \le |G_{ii}(z)| \le C$ in $\Omega_\Lambda^c$ (4.25)

for any $z \in S$ with some universal constant $c > 0$. Here we estimated $\big|\, |G_{ii}| - |m_{sc}| \,\big| \le \Lambda_d$, and we used from
(4.6)-(4.7) that $m_{sc}(z)$ satisfies $|m_{sc}(z)| \sim 1$ for $z \in S$.
Thus, a special case of (4.16) or (4.15),

$G^{(i)}_{kl} = G_{kl} - \frac{G_{ki} G_{il}}{G_{ii}}$,

together with (4.25) implies that for any $i$ and with a sufficiently large constant $C$

$\max_{k \ne l} |G^{(i)}_{kl}| \le \Lambda_o + C\Lambda_o^2 \le C\Lambda_o$ in $\Omega_\Lambda^c$, (4.26)

$C^{-1} \le |G^{(i)}_{kk}| \le C$ for all $k \ne i$ and in $\Omega_\Lambda^c$, (4.27)

$|G^{(i)}_{kk} - m_{sc}| \le \Lambda_d + C\Lambda_o^2$ for all $k \ne i$ and in $\Omega_\Lambda^c$ (4.28)

and

$|A_i| \le \frac{C}{M} + C\Lambda_o^2$ in $\Omega_\Lambda^c$. (4.29)

Here we have used that

$\Big| \frac{G_{ki} G_{il}}{G_{ii}} \Big| \le c^{-1} \Lambda_o^2$ in $\Omega_\Lambda^c$

with $c$ being the constant in (4.25) and we also used that $\sum_j \sigma_{ij}^2 = 1$. Similarly, with one more expansion
step, we get

$\max_{ij} \max_{k \ne l} |G^{(ij)}_{kl}| \le C\Lambda_o$, $\max_{ij} \max_k |G^{(ij)}_{kk}| \le C$ in $\Omega_\Lambda^c$ (4.30)

and

$|G^{(ij)}_{kk} - m_{sc}| \le \Lambda_d + C\Lambda_o^2$ for all $k \ne i, j$ and in $\Omega_\Lambda^c$. (4.31)

Using these estimates, the following lemma shows that $Z_i$ and $Z^{(ij)}_{ij}$ are small assuming $\Lambda_d + \Lambda_o$ is small
and the $h_{ij}$'s are not too large. The control parameter for the $Z$'s is $\Phi = \Phi(z)$, defined below (4.32). These
bounds hold uniformly in $S$.

Lemma 4.5 Denote by 
and define the exceptional events 



<,:=<,iz) = ^^^±^^^^±t±^, (4.32) 



:=^^max^|/i,,|>(logiV)2"|a,,| 



d{z) := {max|Z,(z)| > (logiV)5+2"$(z)} 



no{z) := <^ max|z|f' (z)| > (log A^)^+2"$(z) 
and we let 

17 := u y [(r2d(z) u r2o(z)) n vti{z) (4.33) 

to be the set of all exceptional events. Then we have

P(0) < C'7V-'=0°gi°g^). ^4 34^, 

Proof. Under the assumption of (2.11), we have 

P(f7i) < CiV-'='°siogW^ (4.35) 

therefore we can work on the complement of this set. Define the event

(log A^) -3/2-, 



nA{z) [kd{z)+ko{z) 



> 2- 
Notice that the estimates (4.26)-(4.31) also hold on the complement of this event, maybe with different constants $C$. We now prove
that for any fixed $z \in S$, we have



r!^(z)n |max|Z,(z)| > C(logiV)2+2"$(z)| ) < C7V"^'°siogW 



(4.36) 



and 



f[^l{z) n |m_ax|z|;-'')(z)| > (logiV)4+2"$(z)| ) < CiV-^i°gi°g^. (4.37) 
To see (4.36), we apply the estimate (4.18) from the large deviation Lemma 4.4, and we obtain that



|z,|<c(iog7V)i+2o L^^G 



(4.38) 



holds with a probability larger than $1 - CN^{-c(\log\log N)}$ for sufficiently large $N$.

Denote by $u^{(i)}_\alpha$ and $\lambda^{(i)}_\alpha$ ($\alpha = 1, 2, \ldots, N-1$) the eigenvectors and eigenvalues of $H^{(i)}$. Let $u^{(i)}_\alpha(l)$ denote
the $l$-th coordinate of $u^{(i)}_\alpha$. Then, using $\sigma_{ij}^2 \le 1/M$ and (4.28), we have



(0|2 



kk 



1 |ui')(fc)|2 , 1 

17 2^ 2^ 



M 



k^i a l-^a "* z\ 

Ad + CAI + 3m msc{z) 



< 



2 3mGi^(z) 
fe I 



A; 



in f2^« 



(4.39) 



Here we defined $|A|^2 := A^* A$ for any matrix $A$ and we used (4.12) to estimate $\mathrm{Im}\, m_{sc}(z)$. Together with
(4.38) we have proved (4.36) for a fixed $z$.

For the offdiagonal estimate (4.37), for $i \ne j$, we have from (4.19) that



|zf)|<C(log7V)3+2. J2 UkGir<Ji, 



kd^ij 



(4.40) 



holds with a probability larger than $1 - CN^{-c(\log\log N)}$ for sufficiently large $N$. Similarly to (4.39), by using
(4.31), we get



kl ^IJ 



< in m 



This proves (4.37). 

Now we start proving (4.34). First we choose an $N^{-10}$-net $\mathcal{N}$ in the set $S$, i.e., a collection of points
$\{z_n\}_{n \in I} \subset S$, such that for any $z \in S$ there is $\widetilde z \in \mathcal{N}$ such that $|z - \widetilde z| \le N^{-10}$. The net can be chosen such
that $|I| \le CN^{20}$. Then (4.36) and (4.37) imply that



3z e :N-, s.t. holds andmax|Z,(F)| +max|z|f^(F)| > 2(logiV)''+'"$(F) ) < CiY-^iog'og 



- c log log 



(4.41) 
Now let $z \in S$ be arbitrary and choose $\widetilde z \in \mathcal{N}$ such that $|z - \widetilde z| \le N^{-10}$. For any fixed $i \ne j$, we have



\z'^\z)\-\z'^^(z)\\<\z~z\rn^ 



(4.42) 



By dzfp/dz -- 



Y.sMm) ^'k^ks^^si'^^I and maxab |G|^;^^| < t? \ we have 



max 



5Z, 



(92 



-(0 



< 



in Vt\. 



In the last inequality, we used the assumption $\eta > N^{-1}$. Thus



< iV" 



in rj? 



Since $ > M^^f^-q^^/^ > cN'^ for z e 5, we obtain 

|z(;^'(z)|-|zf)(?)||<$(z) in Of, 
and exactly in the same way, we have 

\Z,{z)\ - |Z,(z)| I < in ni. 

Moreover, by estimating \dzG\ < N"^ in 5', we see that h.d{z), Ao(z), and ^{z) are Lipschitz continuous 
functions in 5 with a Lipschitz constant bounded by CN^ . Therefore $(z) can be replaced with $(2:) in the 
lower bound on \z[^p (z)\ and \Zi{z)\ obtained from (4.41), and, furthermore, ^%iz) C i^'}^(z) using a trivial 
upper bound 6{z) < N. Thus we get 



3z eSs.t. ni{z) and ill hold andmax |Z,(z)| + max (z)| > (log 7V)^+2"$(F) ) < CN 
Combining this with (4.35), we obtain (4.34) and thus Lemma 4.5. 



-c log log 



□ 



Our goal is to show that $\Lambda_o(z) + \Lambda_d(z)$ is smaller than $(M\eta)^{-1/2}$ (modulo edge and logarithmic corrections)
for any $z \in S$ in the event $\Omega^c$. We will use a continuity argument. In Lemma 4.6 we show for any $z \in S$
that if $\Lambda_o(z) + \Lambda_d(z)$ is smaller than $(\log N)^{-3/2}$, then it is actually also smaller than $(M\eta)^{-1/2}$. In Lemma
4.9 we show that this input condition holds at least for $\mathrm{Im}\, z = \eta = 10$. Then reducing $\eta$, we show by a
continuity argument that it holds for each $z \in S$.

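Lemma 4.6 (Bootstrap) Let $z = E + i\eta$ satisfy (4.2), in particular $z \in S$. Recall $\Lambda_d$, $\Lambda_o$ and $\Omega$
defined in (4.23) and (4.33). Then we have that, in the event $\Omega^c$, if

$\Lambda_o(z) + \Lambda_d(z) \le \frac{(\log N)^{-3/2}}{\theta(z)}$ (4.43)

then we have

$\Lambda_o(z) + \Lambda_d(z) \le (\log N)^{6+2\alpha}\, \frac{(\kappa+\eta)^{1/4}}{\sqrt{M\eta}}\, \theta(z)$ (4.44)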
(4.44) 
and we also have a stronger bound for the off-diagonal terms:

A.(z)<(logiVy^+^" ^"t^'^' . (4.45) 

Proof of Lemma 4.6. First note that condition (4.43) is equivalent to assuming the event $\Omega_\Lambda^c(z)$ and we have

n-nni{z)cn'^,{z)uni{z), (4.46) 

so the event on the right-hand side of (4.46) holds. We recall from (4.12) that

$\frac{1}{\theta(z)} \le C\sqrt{\kappa+\eta} \le C$, $z \in S$. (4.47)



With the assumption (4.43) we have (see (4.25), (4.27)) 



and by (4.47) 



c<\Gu\<C, c<\G^\<C (4.48) 



and thus, by (4.2) and (4.47), 



+ ^)"' < $(^) < c^Jl+^ < C(l0g7V)-12-3a^ (4 50) 



We first estimate the offdiagonal term $G_{ij}$. From (4.14) we have

$|G_{ij}| = |G_{ii}|\, |G^{(i)}_{jj}|\, |K^{(ij)}_{ij}| \le C\big( |h_{ij}| + |Z^{(ij)}_{ij}| \big)$, $i \ne j$, (4.51)

where we used (4.48).

By the remark after (4.46) we have

$|h_{ij}| + |Z^{(ij)}_{ij}| \le \frac{C(\log N)^{2\alpha}}{\sqrt{M}} + C(\log N)^{5+2\alpha}\Phi \le C(\log N)^{5+2\alpha}\Phi$,

where we used (4.50) to show that the first term can be absorbed into the second. From the second inequality
in (4.50) we also have

$\Lambda_o = \max_{i \ne j} |G_{ij}| \le C(\log N)^{5+2\alpha}\, \frac{(\kappa+\eta)^{1/4}}{\sqrt{M\eta}}$. (4.52)

This proves the estimate (4.45). Using (4.47), we also see that (4.44) holds for the summand $\Lambda_o$.

Now we estimate the diagonal terms. Recalling $\Upsilon_i = A_i + h_{ii} - Z_i$ from (4.21), with (4.29), (4.50), (4.52)
we have,

T = T(z) :^max|T,(z)| < C ^^'^^ " + C(log iV)^+^°^ in l]'^ n f7^(z). (4.53) 
Again, the first term can be absorbed into the second, so we have proved

T < (log iv)5+2"$ < c(iog N)-^ in rj'^ n ni{z). 

In the last step we used (4.50). 

From (4.22) we have the identity 



(4.54) 



G 



II '"'SC 



(4.55) 



Using {irisc + z) = —m^^, and the fact that \msc + ^1 > 1, so with A^^ + T < j^\msc + z\ (see in (4.49) and 
(4.54)), we can expand (4.55) as 



J2 4^J- - Tf^) + oU^ci + Tf) . (4.56) 



Summing up this formula for all i and recalling the definition v = u,; — m — riisc yield 



Introducing the notations C := to^^(z), T jt J2i '^i f^^' simplicity, we have (using A^ < 1) 



1-C 



c 



T + 0(Y^(A<i + T)^) =0 



c 



1-C 



(A^ + t; 



(4.57) 



Recall that $\Sigma$ denotes the matrix of covariances, $\Sigma_{ij} = \sigma_{ij}^2$, and we know that 1 is a simple eigenvalue with
the constant vector $e = N^{-1/2}(1, 1, \ldots, 1)$ as the eigenvector. Let $Q := I - |e\rangle\langle e|$ be the projection onto the
orthogonal complement of $e$; note that $\Sigma$ and $Q$ commute. Let $\|\cdot\|_{\infty\to\infty}$ denote the $\ell^\infty \to \ell^\infty$ matrix norm.
With these notations, (4.56) can be written as

^—^'^ Cj2^rjivj ~v)~ c(t. - t) + O (Id (A^ + T)) + o((A, + Tf), 
3 

and the error terms for each i sums up to zero. Therefore, with T < 1, we have 



-Ed 



cs 



CQ 



II 1 — lloo— fcx 

Combining (4.57) with (4.58), we have 
max \vi \ < C 



(T, -T) +0 



0(A^ + T). 



CQ 



1-CS 



(A^ + T) 



( 


CQ 




C 


) 




1 -cs 


+ 

00— >-oo 


1-C 





(A^ + T). 



(4.58) 



(4.59) 



To estimate the norm of the resolvent, we recall the following elementary lemma (Lemma 5.3 in [19]). 
Lemma 4.7 Let $\delta_- > 0$ be a given constant. Then there exist small real numbers $\tau \ge 0$ and $c_1 > 0$,
depending only on $\delta_-$, such that for any positive number $\delta_+$, we have

$\max_{x \in [-1+\delta_-,\, 1-\delta_+]} \big| \tau + x\, m_{sc}^2(z) \big|^2 \le (1 - c_1 q(z))(1+\tau)^2$ (4.60)

with

$q(z) := \max\{\delta_+, |1 - \mathrm{Re}\, m_{sc}^2(z)|\}$. (4.61)

□

Lemma 4.8 Suppose that $\Sigma$ satisfies (2.4), i.e., $\mathrm{Spec}(Q\Sigma) \subset [-1+\delta_-, 1-\delta_+]$. Then we have

$\Big\| \frac{Q}{1 - m_{sc}^2(z)\Sigma} \Big\|_{\infty\to\infty} \le \frac{C(\delta_-) \log N}{q(z)}$ (4.62)

with some constant $C(\delta_-)$ depending on $\delta_-$ and with $q$ defined in (4.61).

Proof. Let $\|\cdot\|$ denote the usual $\ell^2 \to \ell^2$ matrix norm and introduce $\zeta = m_{sc}^2(z)$. Rewrite

$\frac{Q}{1 - \zeta\Sigma} = \frac{1}{1+\tau}\, \frac{Q}{1 - \frac{\zeta\Sigma + \tau}{1+\tau}}$

with $\tau$ given in (4.60). By (4.60), we have



1+T 



l + T 



1 + r 



■Q 



< sup 

a;e[-l+<5_,l-i5+] 



1 + r 



<(l-cig(z))i/2. 



To estimate the $\ell^\infty \to \ell^\infty$ norm of this matrix, recall that $|\zeta| = |m_{sc}|^2 \le 1$ and $\sum_j |\Sigma_{ij}| =
\sum_j \sigma_{ij}^2 = 1$. Thus we have



CS + T 



1+T 

To see (4.62), we can expand 
1 



CS + r 



+ T / jj 



< 



:^|CS.,+r%|<4^<l 



1 + r 



1 - 



1 + T 



7^ 



< 



oo— J-oo 



CS +T II" 



n<no 



CS + T\" 



Q 



I CX3— J-OO 



<710 + VA^ 51 |K\ 



CS + r 



no 



TV 5] (1 ~ ci(z(z))' 



/2 



n>7io 



= no + CvN — < — — . 

q{z) q(z) 

Choosing $n_0 = C \log N / q(z)$ with a large $C$, we have proved the Lemma.



□ 



We now return to the proof of Lemma 4.6; recall that we are in the set $\Omega^c \cap \Omega_\Lambda^c(z)$. First, inserting (4.5)
and (4.62) into (4.59), and using $1/q \le \theta$, we obtain

$\Lambda_d = \max_i |v_i| \le C\theta(z)(\Lambda_d^2 + \Upsilon) \log N$.
By the assumption (4.43), we have $C\theta(z)\Lambda_d \log N \le 1/2$ for large enough $N$, therefore we get

$\Lambda_d \le C\theta(z)\, \Upsilon \log N$.
Using the bound on $\Upsilon$ in (4.54) and (4.50), we obtain

$\Lambda_d \le C\theta(z)(\log N)^{6+2\alpha}\, \frac{(\kappa+\eta)^{1/4}}{\sqrt{M\eta}}$,

which, together with (4.52), completes the proof of (4.44). □
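The absorption step at the end of this proof is elementary, and the following short derivation spells it out. The abbreviation $K := C\theta(z)\log N$ is introduced here only for readability and is not notation from the paper; $\Upsilon$ stands for the quantity $\max_i |\Upsilon_i|$ bounded in (4.54).

```latex
\[
\Lambda_d \le K\,(\Lambda_d^2 + \Upsilon), \qquad K\Lambda_d \le \tfrac12
\quad\Longrightarrow\quad
\Lambda_d \le \tfrac12\,\Lambda_d + K\Upsilon
\quad\Longrightarrow\quad
\Lambda_d \le 2K\Upsilon .
\]
```

The condition $K\Lambda_d \le 1/2$ comes from (4.43): it gives $\theta(z)\Lambda_d \le (\log N)^{-3/2}$, hence $K\Lambda_d \le C(\log N)^{-1/2}$, which is at most $1/2$ once $N$ is large.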

Lemma 4.9 (Initial step) Define 

Off := > 3}, 

recall the definitions offli, fid and flo from (4.33) and define 

:= Off U Oi U IJ |t7o(z) U fldiz) : z = E + lOi, \E\ < 5|. (4.63) 

Then we have 

Furthermore, in the set 51"^ we have 

(1ok7V)-3/2 

Ao{z)+A,{z)<^-^^ (4.65) 

for z = E+lQi, \E\ < 5. 

Proof. The exceptional event fiff is controlled by Lemma 7.2 of [19]. For convenience, we will recall this 
result in Lemma 6.2, Eq. (6.11), and we note that the condition of this lemma, M > (log A^)^, is implied by 
(4.2) and (4.12)). Thus we have P(Off) < CN-<^°siogN) _ 

Denote by Ua and the eigenvectors and eigenvalues of H. On the set fl'jj all eigenvalues are bounded, 
|Aa| < 3. In this set we have, with \E\ < 5, 

3mG,, = vT. n '""^^2' 2 > - E = - (4-66) 

a. ^ ^ a 

with some positive constant c > 0. We also have the upper bound \Gkk\ < ^-^d + < C/rj. In 
particular, for 77 = 10, we have 

c<\Gkk\<C, innj,, (4.67) 

with some positive constants. Inspecting the proof of Lemma 4.5, notice that the restriction to the set fi^ was 
used only to obtain the estimate (4.25). Once this estimate is obtained independently, as in (4.67) in the set 
ri^, all the estimates (4.26)"(4.31) hold and these are the necessary inputs for Lemma 4.5. Thus, following the 
proof of (4.36)-(4.37), and replacing ^1% with 17^, we obtain that P{r2|^ n (Oo(z) U 17d(z))} < C'iV-^i°gi°s^ 
for each fixed z = E + lOi, \E\ < 5. Finally, this estimate can be extended to hold simultaneously for all 
z = E + lOi, \E\ < 5 using an N~^°-net as for the proof of (4.34). This proves (4.64). 
Similarly, the argument (4.51)-(4.52) shows that in the set W^, we have



C(log^)^ ^^^^^Q^ (4_g8N 
y/M 



and the argument (4.53)-(4.54) guarantees that 



C(log7V)5+2^ 

T(z) < z= , z = h + lOi, (4.69) 

^/M 



in ri"^. Finally, to control A^, we use that from the self consistent equation (4.55) and the definition of ttIsc, 
we have 

For rj = 10, with (2.9), we have \z + msc{z)\ > 2. Using \Gii\ < rj^^ = and \msc\ < rj^^ = j^, we obtain 

< < 1/5, l<i<N. (4.71) 

Using (4.69), together with \z + msc{z)\ > 2 and (4.71), we obtain that the absolute value of the r.h.s of 
(4.70) is less than 

0(T). (4.72) 



sup, \v^\ 



\z + msc{z)\ - sup,; \vi\ 
Taking the absolute value of (4.70) and maximizing over n, we have 

Ad = sup|«„| < — -^^^ T- + 0(T). (4.73) 

„ \z + msc\-Ad 

Since the denominator satisfies \z + msc{z) \ — supj \vi\ > 2 — 1/5, 

Ad < CT (4.74) 

follows from the last equation. Combining it with (4.68) and (4.69), we obtain (4.65), and this completes 
the proof of Lemma 4.9. q 

Proof of Theorem 4.1. Lemma 4.6 states that, in the event $\Omega^c$, if $\Lambda_d(z) + \Lambda_o(z) \le R(z)$ then $\Lambda_d(z) +
\Lambda_o(z) \le S(z)$ with

$R(z) := (\log N)^{-3/2}\, (\theta(z))^{-1}$, $S(z) := (\log N)^{6+2\alpha}\, \frac{(\kappa+\eta)^{1/4}}{\sqrt{M\eta}}\, \theta(z)$.

By assumption (4.2) of Theorem 4.1, we have $S(z) < R(z)$ for any $z \in S$ and these functions are continuous.
Lemma 4.9 states that, on the complement of the exceptional set defined in (4.63), the bound $\Lambda_d(z) + \Lambda_o(z) \le R(z)$ holds for $\eta = 10$.

Thus by a continuity argument, $\Lambda_d(z) + \Lambda_o(z) \le S(z)$ on the intersection of $\Omega^c$ with that complement, as long as the condition (4.2) is
satisfied. Finally, once $\Lambda_o(z) \le S(z)$ is proven, we can use $S(z) < R(z)$ (in the domain $D$) and Lemma 4.6
once more to conclude the stronger bound on $\Lambda_o(z)$. This proves Theorem 4.1.

We record that combining the bound on A^, Aq with (4.54), we also proved that under the assumption 
(4.2) we have 

Ad(z) + A„(z) + T(z) < C(log iV)^^+^° ^'^ e{z) inll^nf}^ (4.75) 



□ 
5 Local semicircle law

In this section we strengthen the estimate of Theorem 4.1 for the Stieltjes transform $m(z) = \frac{1}{N}\sum_i G_{ii}$. The
key improvement is that $|m - m_{sc}|$ will be estimated with a precision $(M\eta)^{-1}$ while $|G_{ii} - m_{sc}|$ was
controlled by a precision $(M\eta)^{-1/2}$ only (modulo logarithmic terms and terms expressing the deterioration
of the estimate near the edge).

Theorem 5.1 Assume the conditions of Theorem 4.1 and recall the notations $\kappa = \kappa_E := \big|\, |E| - 2 \,\big|$ and $\theta(z)$
from (4.1). Define the domain

D* ■.^\^z = E + irjeC : \E\<5, ^ < rj < 10 , Mt] > {\ogN)^*+^''e^{z)iK + rj^^^}. (5.1) 
Then for any e > and K > there exists a constant C ^ C{e, K) such that 

U {|m(z) - m.e(z)| > -^\) < (5.2) 

Proof of Theorem 5.1. We will work in the set fi'^ D Qf^, which has almost full probability by (4.34) and 
(4.64). Note that the set D* is included in the domain defined by (4.2), therefore we can use the estimates 
from Section 4. 

As in (4.57), where v — m{z) — msc{z), we have that 

m-msc = - E + 0{t^^^^ + 

i 

holds with a very high probability. Recall that C = m'^^(z) and we mostly omit the argument z from the 
notations. The quantities A^, and T were defined in (4.23), (4.21) and (4.53). Then with (4.75) we have 

<^)-(r4,^i:T,).o(^^ 

holds with a very high probability for any small £ > 0. Recall that T, = A, + hu — Zi. We have, from (4.20), 
(4.25) and cr^- < 

^ - M ° - Mr] ' 

where we used (4.75) to bound Ao and (4.47) to control the C/M term. 
We thus obtain that



holds with a very high probability. Since the $h_{ii}$'s are independent, applying the first estimate in the large
deviation Lemma 4.4, we have



(|ii:^..h(.o.A')-«^) 
On the complement event, the estimate (\ogN)^^'^~^°'{MN) ^1"^ can be included in the last error term in
(5.3). It only remains to bound 



1 " 



TV 

whose moment is bounded in the next lemma which will be proved in Sections 8 and 9. 
Lemma 5.2 For fixed z in domain D* (5.1) and any even number p, we have 

N 

<Cp {{log N)'+^^X^Y 



E 



for sufficiently large $N$, where

X = X{z) ■.= {\ogN) 



i=l 



/Mrj 



z = E + ir), K = \\E\ - 2|. 



(5.5) 



(5.6) 



Using Lemma 5.2, we have that for any $\varepsilon > 0$ and $K > 0$

N 




Mri 



for sufficiently large $N$. Combining this with (5.4) and (5.3) and noting that $|1 - \zeta| \sim \sqrt{\kappa+\eta}$, see (4.7), we
obtain (5.2) and complete the proof of Theorem 5.1.

□ 

6 Empirical counting function 

In this section we translate the information on the Stieltjes transform obtained in Theorem 5.1 to an asymptotic
on the empirical counting function. The main ingredient for the first step is the following lemma based
upon the Helffer-Sjöstrand formula. We will formulate this lemma for general signed measures, but we will
apply it to the Stieltjes transform $m^\Delta = m - m_{sc}$ of the difference between the empirical density and the
semicircle law. A similar statement was already proven in Lemma B.1 in [15] and Lemma 7.7 in [19].

Lemma 6.1 Let be a signed measure on the real line with supp C [—K,K] for some fixed constant 
K > A. For any Ei, E2 G [—3, 3] and rj £ (0, 1/2] we define /(A) = fEi.E2,ri{^) to be a characteristic function 
of [Ei,E2] smoothed on scale rj, i.e., f = 1 on [Ei,E2], / = on M \ [Ei — rj,E2 + rj] and |/'| < Crj~^ , 



I /"I < Cri~^. For any a; € M, set 



2 . Let m be the Stieltjes transform of g . Suppose for some 



positive U , and non-negative constant A we have 

I A / \ I CU 

\m {x + iy) \ < 



for l>y>0, |xi<A' + l, 



y{Kx + y)^ 

and in case of A > we additionally assume < ^ niin{K£;j, kej}. Then 

CU\logrj\ 



fEl.E2.7,{>^)Q'^{>^)dX 

with some constant C depending on K and A. 



< 



[mm{ KE I, KE2}y 



(6.1) 



(6.2) 
Proof of Lemma 6.1. For simplicity, we drop the $\Delta$ superscript in the proof. Analogously to (B.13),
(B.14) and (B.15) in [15] we obtain that (with $f = f_{E_1,E_2,\eta}$)



fiX)g{X)dX 



< C i\f{x)\ + \y\\f'ix)\)\x'{y)\\mix + iy)\dxdy 



+C 
+C 



\y\<'i 



\y\>n 



y f" {x)x{,y)'^vcy m{x + iy)dxdy 



y f" {x)x{y)'^m m{x + iy)dxdy 



(6.3) 



where xiv) is a smooth cutoff function with support in [—1, 1], with xiv) — 1 for \y\ £ 1/2 and with bounded 
derivatives. The first term is estimated by, with (6.1), 



{\fix)\ + \y\\f{x)\)\x'{y)\\m{x + zy)\dxdy < CU. 

For the second term in r.h.s of (6.3) we use that from (6.1) it follows for any 1 > y > that 

CU 



y\3mm{x + iy)\ < 



(AC. + ' 



With |/"| < C7?-2 and 
we get 



supp/'(a;) C {Ix - -Ell < 7]} U {|a; ~ E^] < t?}, 

CU 

second term in r.h.s of (6.3) < 



(6.4) 

(6.5) 
(6.6) 



Ym{KEi,KE2}]' 



As in (B.17) and (B.19) in [15], we integrate the third term in (6.3) by parts first in x, then in y. Then 
we bound it with an absolute value by 

c( v\fix)\\Dlemix + iT])\dx + c[\f{x)x'iy)^(:rn{x + iy)\ + —[ [ m{x + iy)\dxdy . 

J\x\<K+i Jm^ ''^ Jv<y<^J\^-E\<v 

(6.7) 

The second term is bounded in (6.4). By using (6.1) and (6.6) in the first term and (6.1) in the third, we 
have 



(6.7) < 



CU 



< 



[min{KEi,KE2}y 
CU\logr,\ 

[inm{KEi,KE2}y 



CU + CUrj-^ V / dx [ — 



v) 



Ady 



□ 



Let $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_N$ be the ordered eigenvalues of a universal Wigner matrix. We define the
normalized empirical counting function by

$\mathfrak{n}(E) := \frac{1}{N}\, \#\{\lambda_j \le E\}$ (6.8)

and the averaged counting function by

$n(E) = \frac{1}{N}\, \mathbb{E}\, \#[\lambda_j \le E]$. (6.9)

Finally, let

$n_{sc}(E) := \int_{-\infty}^{E} \varrho_{sc}(x)\, dx$ (6.10)

be the distribution function of the semicircle law, which is very close to the counting function of the $\gamma$'s, $n_\gamma(E) :=
\frac{1}{N}\, \#[\gamma_j \le E]$.
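The objects in (6.8) and (6.10) are straightforward to compute. The sketch below is illustrative only: it assumes the semicircle density $\varrho_{sc}(x) = \frac{1}{2\pi}\sqrt{(4 - x^2)_+}$ supported on $[-2, 2]$, uses the resulting closed form of its distribution function, and the function names are hypothetical.

```python
import math


def rho_sc(x: float) -> float:
    """Semicircle density on [-2, 2] (assumed normalization)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def n_sc(e: float) -> float:
    """Distribution function of the semicircle law, in closed form."""
    if e <= -2.0:
        return 0.0
    if e >= 2.0:
        return 1.0
    return 0.5 + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi) + math.asin(e / 2.0) / math.pi


def empirical_counting(eigenvalues, e: float) -> float:
    """Normalized empirical counting function (1/N) #{lambda_j <= e}."""
    return sum(1 for lam in eigenvalues if lam <= e) / len(eigenvalues)


def n_sc_numeric(e: float, steps: int = 20000) -> float:
    """Midpoint-rule integral of rho_sc over [-2, e], as a cross-check."""
    width = (e + 2.0) / steps
    return sum(rho_sc(-2.0 + (i + 0.5) * width) for i in range(steps)) * width


if __name__ == "__main__":
    for e in (-1.0, 0.0, 1.0):
        print(e, n_sc(e), n_sc_numeric(e))
```

The closed form and the numerical integral should agree to several digits, and `empirical_counting` can then be compared with `n_sc` for any list of eigenvalues supplied by the reader.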

We will need some control on the spectral edge; we recall Lemma 7.2 from [19].

Lemma 6.2 (1) Let the universal Wigner matrix H satisfy (2.1), (2.2) and (2.11) with M > (logiV)^. 
Then we have 

n(-3) < CA^-'^iogiog^ and n{3) > 1 - cat- ^ i°g i°g (6.11) 

(2) Let H be a generalized Wigner matrix with subexponential decay, i.e., (2.1), (2.2), (2.6) and (2.11) 
hold. Then , , 

n{-2 - iV-i/6+e) < (j^~N^ and n{2 + A^-Ve+e) > ^ _ (j^-n- ^ ^g^^2) 

for any small £ > with an e' > depending on e. Furthermore, for K > 3, 

n{-K) < e"^'i°g^ and n{K) > 1 - 6-^''°^^:^ 

for some £ > 0. 

With these preliminary lemmas, we have the following theorem that we state for universal Wigner matrices 
and for their subclass, the generalized Wigner matrices in parallel. 

Theorem 6.3 Let A ~ 2 for universal Wigner matrices and A ~ 1 for generalized Wigner matrices. Suppose 
that the universal Wigner matrix ensemble satisfies (2.1), (2.2) and (2.11) with M > (logiV)^^^^" and the 
generalized Wigner matrix ensemble satisfies (2.1), (2.2), (2.6) and (2.11). We recall M — N in the latter 
case. Then for any £ > and K > 1 there exists a constant C{e, K) such that 

CiV^T C{e,K) 



4 sup \n{E) - nsc{E)\ < ^| > 1 



\E\<i 

where the n{E) and ng(.{E) were defined in (6.8) and (6.10) and ke — \ \E\ ~ 2|. 

Proof. For definiteness, we will consider the case of generalized Wigner matrices, i.e., $A = 1$. In this case
$M = N$, $\delta_+ \ge C_{inf} > 0$ (see (2.7)) and thus $\theta(z) \le C(\kappa+\eta)^{-1/2}$ for $|z| \le 10$, see (4.10). For simplicity
of the presentation, we assume that $\theta(z) = (\kappa+\eta)^{-1/2}$ as overall constant factors do not matter (see the
remark after (4.10)). We set $\eta = 1/N$, $U = N^{\varepsilon - 1}$ and apply Lemma 6.1 to the difference $m^\Delta = m - m_{sc}$.
Let $\varrho^\Delta = \varrho - \varrho_{sc}$, where $\varrho(x) = \frac{1}{N}\sum_j \delta(x - \lambda_j)$ is the normalized empirical counting measure of eigenvalues.

First we check the conditions of Lemma 6.1. To check that (6.1) holds, set L = (log Af)^^+^" and for a fixed 
X, let Ux satisfy Ny^iKx + VxY^"^ = L, so that x + iyx € D* . Clearly (6.1) holds for any y > yx with a very
high probability by (5.2). In particular, we know that 



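$|m(x + iy_x) - m_{sc}(x + iy_x)| \le \frac{CU}{y_x(\kappa_x + y_x)}$. (6.14)

Consider $y < y_x$, set $z = x + iy$, $z_x = x + iy_x$ and estimate

$|m(z) - m_{sc}(z)| \le |m(z_x) - m_{sc}(z_x)| + \int_y^{y_x} \big| \partial_\eta \big( m(x + i\eta) - m_{sc}(x + i\eta) \big) \big|\, d\eta$. (6.15)

Note that

$|\partial_\eta m(x + i\eta)| = \Big| \frac{1}{N} \sum_j \partial_\eta G_{jj}(x + i\eta) \Big| \le \frac{1}{N} \sum_{jk} |G_{jk}(x + i\eta)|^2 = \frac{1}{N\eta} \sum_j \mathrm{Im}\, G_{jj}(x + i\eta) = \frac{1}{\eta}\, \mathrm{Im}\, m(x + i\eta)$, (6.16)

and similarly

$|\partial_\eta m_{sc}(x + i\eta)| = \Big| \int \frac{\varrho_{sc}(s)}{(s - x - i\eta)^2}\, ds \Big| \le \int \frac{\varrho_{sc}(s)}{|s - x - i\eta|^2}\, ds = \frac{1}{\eta}\, \mathrm{Im}\, m_{sc}(x + i\eta)$. (6.17)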
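The middle equality in (6.16) is the identity $\sum_k |G_{jk}|^2 = \eta^{-1}\, \mathrm{Im}\, G_{jj}$, valid for the resolvent of any Hermitian matrix. A small numerical check is sketched below; it is an illustration on an arbitrarily chosen real symmetric $3 \times 3$ matrix, and the helper names `inverse` and `stieltjes` are hypothetical.

```python
def inverse(a):
    """Gauss-Jordan inverse of a small complex matrix (list of lists)."""
    n = len(a)
    aug = [[complex(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def stieltjes(h, energy, eta):
    """Return m(z) = (1/N) sum_j G_jj and (1/N) sum_{jk} |G_jk|^2 at z = E + i*eta."""
    n = len(h)
    z = complex(energy, eta)
    g = inverse([[h[i][j] - (z if i == j else 0.0) for j in range(n)]
                 for i in range(n)])
    m = sum(g[i][i] for i in range(n)) / n
    hs = sum(abs(g[i][j]) ** 2 for i in range(n) for j in range(n)) / n
    return m, hs


if __name__ == "__main__":
    H = [[0.2, 0.7, -0.4], [0.7, -0.5, 0.3], [-0.4, 0.3, 0.1]]
    energy, eta, step = 0.3, 0.05, 1e-6
    m, hs = stieltjes(H, energy, eta)
    m_up, _ = stieltjes(H, energy, eta + step)
    m_down, _ = stieltjes(H, energy, eta - step)
    derivative = abs((m_up - m_down) / (2 * step))  # finite-difference |d m / d eta|
    print("sum |G_jk|^2 / N :", hs)
    print("Im m / eta       :", m.imag / eta)
    print("|d m / d eta|    :", derivative)
```

The first two printed numbers should coincide up to rounding, and the third should not exceed them, which is the content of (6.16).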



Now we use the fact that the functions $y \to y\, \mathrm{Im}\, m(x + iy)$ and $y \to y\, \mathrm{Im}\, m_{sc}(x + iy)$ are monotone increasing
for any $y > 0$ since both are Stieltjes transforms of a positive measure. Therefore the integral in (6.15) can
be bounded by

$\int_y^{y_x} \frac{d\eta}{\eta} \big[ \mathrm{Im}\, m(x + i\eta) + \mathrm{Im}\, m_{sc}(x + i\eta) \big] \le y_x \big[ \mathrm{Im}\, m(x + iy_x) + \mathrm{Im}\, m_{sc}(x + iy_x) \big] \int_y^{y_x} \frac{d\eta}{\eta^2}$. (6.18)

By the choice of $y_x$ and using that $\mathrm{Im}\, m_{sc}(z_x) \le C\sqrt{\kappa_x + y_x}$, we have

$\mathrm{Im}\, m_{sc}(z_x) \le \frac{CU}{y_x(\kappa_x + y_x)}$ (6.19)

and then $\mathrm{Im}\, m(z_x)$ can be estimated from (6.14). Inserting these estimates into (6.15) and (6.18), and
using (6.14), we get

$|m(z) - m_{sc}(z)| \le |m(z_x) - m_{sc}(z_x)| + \frac{CU}{y_x(\kappa_x + y_x)}\, \frac{y_x}{y} \le \frac{CU}{y(\kappa_x + y)}$

with a possibly larger $C$ in the r.h.s. Thus (6.1) holds for the difference $m^\Delta = m - m_{sc}$.
The application of Lemma 6.1 shows that for $\eta = 1/N$

$\Big| \int f_{E_1,E_2,\eta}(\lambda)\, \varrho(\lambda)\, d\lambda - \int f_{E_1,E_2,\eta}(\lambda)\, \varrho_{sc}(\lambda)\, d\lambda \Big| \le \frac{CN^{2\varepsilon}}{N \min\{\kappa_{E_1}, \kappa_{E_2}\} + 1}$. (6.20)

Recall that $f_{E_1,E_2,\eta}$ is the characteristic function of the interval $[E_1, E_2]$, smoothed on scale $\eta$ at the edges.
The additional 1 in the denominator in the r.h.s. of (6.20) comes from the case when $\kappa_{E_1}$, $\kappa_{E_2}$ are very
small and the trivial estimate $f \le 1$ with $\int \varrho = \int \varrho_{sc} = 1$ gives a better bound than Lemma 6.1.
With the fact that $y \to y\, \mathrm{Im}\, m(x + iy)$ is monotone increasing for any $y > 0$, (6.19) implies a crude upper
bound on the empirical density. Indeed, for any interval $I := [x - \eta, x + \eta]$, with $\eta = 1/N$, we have

$\mathfrak{n}(x + \eta) - \mathfrak{n}(x - \eta) \le C\eta\, \mathrm{Im}\, m(x + i\eta) \le C y_x\, \mathrm{Im}\, m(x + iy_x) \le \frac{CN^{2\varepsilon}}{N\kappa_x + 1}$ (6.21)

since $\eta = 1/N \le y_x$ for any $x$.

Choose arbitrary £'i,i?2 G [—3,3], then we have 

n(i?i) - n{E2) - I fE^^E,,^{\)g{X)'i\\ <C ^ [n{E, + 77) - n(i?, - 7? 



.21) 



i=i,2 



from (6.21). Since is bounded, we also have 

nsc{Ei) - UsaiE^) - / fE,^E,,r,WgsciX)dX < Crj = C/N. 



3.22) 



3.23) 



Subtracting (6.22) and (6.23) and using (6.20), we obtain that for any $E_1, E_2 \in [-3, 3]$



[n{Ei) - n{E2)] - [n,c{Ei) - n,c{E2)] 



< 



N iiim{KEi, KE2} + ^ 



with a very high probability, i.e., apart from a set of probability smaller than $C(\varepsilon, K) N^{-K}$ for any $K$. The
estimate (6.13) from Lemma 6.2 on the extreme eigenvalues shows that $\varrho$ is supported in $[-3, 3]$ with very
high probability, i.e., $\mathfrak{n}(-3) = n_{sc}(-3) = 0$, $\mathfrak{n}(3) = n_{sc}(3) = 1$. Thus we obtain that



n(£;) - n,,{E) 



< 



CN 



2e 



Nke + 1 



(6.24) 



holds for any fixed $E \in [-3, 3]$ with an overwhelming probability.

We now choose a fine grid of equidistant points $E_j \in [-3, 3]$ with $|E_j - E_{j+1}| \le N^{-1}$, then (6.24) holds
simultaneously for every $E = E_j$ with an overwhelming probability. For any $E \in [-3, 3]$ we can find an $E_j$
with $|E - E_j| \le N^{-1}$ and by (6.21) we obtain



\n{E) - n{Ej)\ < n{Ej + l/N) - n{Ej - l/N) < 



CN 



2e 



Nke, + 1 



This guarantees that (6.24) holds simultaneously for all $E$. Since $\varepsilon > 0$ was arbitrary, this proves Theorem
6.3 for generalized Wigner matrices.

The proof for universal Wigner matrices is very similar, just $M$ replaces $N$ in the estimates, $U = N^{\varepsilon} M^{-1}$
and instead of $\theta(z) \le C(\kappa+\eta)^{-1/2}$ one uses $\theta(z) \le C(\kappa+\eta)^{-1}$ which follows from (4.10). The main technical
estimate (6.20) is modified to



(A)e(A)dA - / fE, .E2,ri 
and the rest of the proof is identical. 



< 



M [ min{ , ke^ }] +1 



3.25) 
□ 
7 Location of eigenvalues



In this section we estimate the mean square deviation of the eigenvalues from their classical location. The 
main input is Theorem 6.3, the estimate on the counting function. For simplicity, we consider only the 
case of generalized Wigner matrices. Similar, but weaker results can be obtained along the same lines for 
universal Wigner matrices. 

Theorem 7.1 Let $H$ be a generalized Wigner matrix with subexponential decay, i.e., assume that (2.1),
(2.2), (2.6) and (2.11) hold. Let $\lambda_j$ denote the eigenvalues of $H$ and $\gamma_j$ be their classical location, defined by
(2.23). Then for any $\varepsilon_0 < 1/7$ and for any $K > 1$ there exists a constant $C$, depending on $K$ and $\varepsilon_0$, such
that

N „ 

^{Y.\>^,-1,?<N-^^']>1~^. (7.1) 

and 

N 

5]E|A,-7,f <CiV--«. (7.2) 

Proof. The proof of (7.2) directly follows from (7.1) by using the estimates on the extreme eigenvalue 
(6.13) from Lemma 6.2. For the proof of (7.1), we can assume that maxj |Aj| < 2 + N~^/'^ since the 
complement event has a negligible probability by (6.12) and (6.13) of Lemma 6.2. From Theorem 6.3 we 
can also assume that 

\n{E)-n.,{E)\<^ (7.3) 

holds for every £ R. 

From the definition of $\gamma_j$ it follows that for $j \le N/2$, i.e., $\gamma_j \le 0$,

$-2 + C_1 \Big( \frac{j}{N} \Big)^{2/3} \le \gamma_j \le -2 + C_2 \Big( \frac{j}{N} \Big)^{2/3}$ (7.4)

with some positive constants $C_1$, $C_2$.
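The two-sided bound (7.4) can be observed numerically. The sketch below is an illustration under an assumption: it takes the classical location $\gamma_j$ to be the solution of $n_{sc}(\gamma_j) = j/N$, which is our reading of the definition (2.23) that is not reproduced in this section. The function names are hypothetical.

```python
import math


def n_sc(e: float) -> float:
    """Distribution function of the semicircle law on [-2, 2]."""
    if e <= -2.0:
        return 0.0
    if e >= 2.0:
        return 1.0
    return 0.5 + e * math.sqrt(4.0 - e * e) / (4.0 * math.pi) + math.asin(e / 2.0) / math.pi


def classical_location(j: int, n: int, iterations: int = 80) -> float:
    """Solve n_sc(gamma) = j / n by bisection on [-2, 2] (assumed definition)."""
    target = j / n
    lo, hi = -2.0, 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if n_sc(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def edge_scaling(j: int, n: int) -> float:
    """Ratio (2 + gamma_j) / (j/n)^(2/3); (7.4) says it stays between C_1 and C_2."""
    return (2.0 + classical_location(j, n)) / (j / n) ** (2.0 / 3.0)


if __name__ == "__main__":
    n = 1000
    for j in (1, 10, 100, 500):
        print(j, classical_location(j, n), edge_scaling(j, n))
```

For $j \le N/2$ the printed ratio should remain between two fixed positive constants, which play the role of $C_1$ and $C_2$ in (7.4).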



Choose $\beta = \frac{2}{5} - \varepsilon$. Consider first those j-indices for which $C_0 N^{1-3\beta/2} \le j \le N - C_0 N^{1-3\beta/2}$ with a
sufficiently large constant $C_0$. We choose $C_0$ so that (7.4) would imply $-2 + 2N^{-\beta} \le \gamma_j \le 2 - 2N^{-\beta}$. We then
claim that

$\lambda_j \in [-2 + N^{-\beta},\; 2 - N^{-\beta}]$ for $C_0 N^{1-3\beta/2} \le j \le N - C_0 N^{1-3\beta/2}$. (7.5)

We will show that $\lambda_j \ge -2 + N^{-\beta}$; the upper bound is analogous. Suppose that $\lambda_j$ were smaller than
$-2 + N^{-\beta}$; then $n(-2 + N^{-\beta}) \ge j/N$. On the other hand, $n_{sc}(-2 + 2N^{-\beta}) \le j/N$ and thus

$n_{sc}(-2 + N^{-\beta}) = n_{sc}(-2 + 2N^{-\beta}) - \int_{-2+N^{-\beta}}^{-2+2N^{-\beta}} \varrho_{sc}(x)\,dx \le \frac{j}{N} - cN^{-3\beta/2}$

with some positive constant c. Therefore

$cN^{-3\beta/2} \le n(-2 + N^{-\beta}) - n_{sc}(-2 + N^{-\beta}) \le CN^{\varepsilon + \beta - 1}$,

where the second inequality follows from (7.3), but this contradicts the choice of $\beta$.

Let j satisfy $C_0 N^{1-3\beta/2} \le j \le N/2$; the indices $N/2 \le j \le N - C_0 N^{1-3\beta/2}$ can be treated analogously.
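Note that $\lambda_{N/2} \le CN^{-1+\varepsilon}$ by (7.3). Define $c(j)$ to be the index of the $\gamma$-point right below $\lambda_j$, i.e.,

$\gamma_{c(j)} \le \lambda_j \le \gamma_{c(j)+1}$.

By (7.5) we see that $-2 + \frac12 N^{-\beta} \le \gamma_{c(j)} \le CN^{-1+\varepsilon}$ and from (7.3) and (7.4) it follows that

$|c(j) - j| \le \frac{CN^{\varepsilon}}{2 + \gamma_{c(j)}} \le CN^{\varepsilon+\beta}$. (7.6)

By the choice of $\beta$ we have $\varepsilon + \beta < 1 - 3\beta/2$, i.e., (7.6) implies $|c(j) - j| \ll j$. Using now (7.4), we have

$|c(j) - j| \le \frac{CN^{\varepsilon}}{2 + \gamma_{c(j)}} \le \frac{CN^{\varepsilon}}{(c(j)/N)^{2/3}} \le \frac{CN^{\varepsilon}}{(j/N)^{2/3}}$. (7.7)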

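Finally, we can estimate

$|c(j) - j| = N \Big| \int_{\gamma_j}^{\gamma_{c(j)}} \varrho_{sc}(x)\,dx \Big| \ge CN |\gamma_{c(j)} - \gamma_j| (2 + \gamma_j)^{1/2} \ge CN |\gamma_{c(j)} - \gamma_j| \Big(\frac{j}{N}\Big)^{1/3}$,

using $|c(j) - j| \ll j$ and hence $(2 + \gamma_j)$ and $(2 + \gamma_{c(j)})$ are comparable. In the last step we also used (7.4).
Combining this with (7.7), we have

$|\gamma_{c(j)} - \gamma_j| \le \frac{CN^{\varepsilon}}{N (j/N)^{2/3} (j/N)^{1/3}} = \frac{CN^{\varepsilon}}{j}$,

and the same estimate holds for $|\gamma_{c(j)+1} - \gamma_j|$ and thus

$|\lambda_j - \gamma_j| \le \frac{CN^{\varepsilon}}{j}$

as well. Therefore

$\sum_{C_0 N^{1-3\beta/2} \le j \le N/2} |\lambda_j - \gamma_j|^2 \le CN^{2\varepsilon - 1 + 3\beta/2} \le CN^{-2/5 + \varepsilon/2}$ (7.8)

by the choice of $\beta$, and a similar estimate holds for the sum over the indices $N/2 \le j \le N - C_0 N^{1-3\beta/2}$ as
well.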

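Now we consider the indices $j \le C_0 N^{1-3\beta/2}$ and $\lambda_j \ge -2 - N^{-\beta}$. By an argument similar to the one that proved
(7.5), we can see that there is a constant $C_3$ such that $\lambda_j \le -2 + C_3 N^{-\beta}$; otherwise $n(-2 + C_3 N^{-\beta}) \le j/N$,
but $n_{sc}(-2 + C_3 N^{-\beta}) \ge j/N + cN^{-3\beta/2}$, which would contradict (7.3). It is easy to see that $\gamma_j \le -2 + CN^{-\beta}$
for all $j \le C_0 N^{1-3\beta/2}$; therefore in this regime we estimate $|\lambda_j - \gamma_j| \le CN^{-\beta}$ and thus

$\sum_{j \le C_0 N^{1-3\beta/2}} |\lambda_j - \gamma_j|^2\, \mathbf 1(\lambda_j \ge -2 - N^{-\beta}) \le C_0 N^{1-3\beta/2} (CN^{-\beta})^2 \le CN^{-2/5 + 7\varepsilon/2}$. (7.9)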

The indices $j \ge N - C_0 N^{1-3\beta/2}$ and $\lambda_j \le 2 + N^{-\beta}$ can be treated similarly.

Finally we deal with the extreme eigenvalues $\lambda_j \le -2 - N^{-\beta}$ with index $j \le C_0 N^{1-3\beta/2}$ and we can
assume that $\lambda_j \ge -2 - N^{-1/7}$. For these indices $-2 \le \gamma_j \le -2 + CN^{-\beta}$ and we can estimate

$|\lambda_j - \gamma_j| \le C|\lambda_j + 2|$.

For any a with $0 < a \le N^{-1/7}$, we have $n_{sc}(-2 - a) = 0$, thus we obtain from (7.3) that

$n(-2 - a) \le \frac{CN^{\varepsilon}}{Na}$.

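Therefore

$\sum_j |\lambda_j - \gamma_j|^2\, \mathbf 1(-2 - N^{-1/7} \le \lambda_j \le -2 - N^{-\beta}) \le C \sum_j |\lambda_j + 2|^2\, \mathbf 1(-N^{-1/7} \le \lambda_j + 2 \le -N^{-\beta})$

$\le CN \int_0^{N^{-1/7}} a\, \frac{N^{\varepsilon}}{Na}\, da$

$\le CN^{-1/7 + \varepsilon}$. (7.10)

The other extreme eigenvalues, $\lambda_j \ge 2 + N^{-\beta}$, are treated analogously.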

Combining (7.8), (7.9) and (7.10) and choosing $\varepsilon$ sufficiently small in the definition of $\beta$, we have proved (7.1)
with any $\varepsilon_0 < 1/7$. □
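The exponent bookkeeping behind (7.8), (7.9) and (7.10) is simple linear arithmetic in $\varepsilon$, and can be double-checked mechanically. The sketch below assumes the choice $\beta = 2/5 - \varepsilon$ and the exponents as we read them from the three estimates; the helper names are ours.

```python
from fractions import Fraction

# An exponent c0 + c1*eps of N is stored as the pair (c0, c1).
def lin(c0, c1=0):
    return (Fraction(c0), Fraction(c1))

def add(*terms):
    return (sum(t[0] for t in terms), sum(t[1] for t in terms))

def scale(c, term):
    c = Fraction(c)
    return (c * term[0], c * term[1])

EPS = lin(0, 1)
BETA = lin(Fraction(2, 5), -1)  # assumed choice beta = 2/5 - eps

# Bulk indices, (7.8): 2*eps - 1 + 3*beta/2
bulk = add(scale(2, EPS), lin(-1), scale(Fraction(3, 2), BETA))
# Indices near the edge, (7.9): (1 - 3*beta/2) + 2*(-beta)
edge = add(lin(1), scale(Fraction(-3, 2), BETA), scale(-2, BETA))
# Extreme eigenvalues, (7.10): -1/7 + eps
extreme = add(lin(Fraction(-1, 7)), EPS)

def worst_exponent_at_zero_eps(*terms):
    """Largest (i.e. weakest) exponent in the limit eps -> 0."""
    return max(t[0] for t in terms)

if __name__ == "__main__":
    print(bulk, edge, extreme, worst_exponent_at_zero_eps(bulk, edge, extreme))
```

With these inputs the weakest of the three bounds is the one for the extreme eigenvalues, which is consistent with the restriction $\varepsilon_0 < 1/7$ in Theorem 7.1.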

8 Moment Estimates of Error Terms 

In this section we prove the second and fourth moment estimates of Lemma 5.2; the general cases will be 
proved in Section 9. 

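Definition 8.1 Define the operator $\mathbb{IE}_i$ as

$\mathbb{IE}_i = \mathbb I - \mathbb E_{\mathbf a^i}$, (8.1)

where $\mathbb I$ is the identity operator.

Recall the definition of $Z_i$, which we rewrite as

$Z_i = \mathbb{IE}_i Z^{(i)}_{ii}, \qquad Z^{(i)}_{ii} = \sum_{k,l \ne i} \overline{a^i_k}\, G^{(i)}_{kl}\, a^i_l = \mathbf a^i \cdot G^{(i)} \mathbf a^i$. (8.2)

We first prove a bound on the Green function $G^{(T)}_{kl}$.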

Lemma 8.1 Recall the definition of X in (5.6). Let t be any fixed positive integer, $T = \{k_1, k_2, \dots, k_t\} \in \mathbb N^t$,
$1 \le k_i \le N$ for any $1 \le i \le t$. Then there exists a constant $C_t$, depending only on t, such that for any
$z \in D^*$ in (5.1) in the set $\Omega^c$ (4.33), we have

$\max_{k \ne l} |G^{(T)}_{kl}(z)| \le C_t X(z)$, (8.3)

max |Gi^^(z)-m,,(z)| < CtX(z)0{z) (8.4) 
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and for some constant c, C independent of t, 



c < min |Gf^)(,)| < max |Gi?(^)| < C, (8.5) 



for sufficiently large N . 



Proof. Consider first the case t = 0. Let Y denote the event inside the probability in the equation (4.3). The
proof of Theorem 4.1 yields that $\Omega^c \subset Y^c$. It is clear that (8.4) holds in the event $Y^c$ and this proves (8.4)
in $\Omega^c$ in the case t = 0. Similarly, in the case of t = 0, we can prove (8.3) using the event in the equation
(4.4). By definition of the domain $D^*$, the right side of (8.4) is o(1) and this proves (8.5) in the case t = 0.
For the case t = 1 and $k_1 = i$, using (4.15) and (4.16), we obtain that

$|G^{(i)}_{lk}| \le |G_{lk}| + |G_{li} G_{ik}|\,|G_{ii}|^{-1}$, (8.6)

$|G^{(i)}_{kk} - m_{sc}| \le |G_{kk} - m_{sc}| + |G_{ki} G_{ik}|\,|G_{ii}|^{-1}$.

Since $X^2 \le X$ in $D^*$, (8.4) and (8.3) in the case t = 1 follow from (8.6) and the case t = 0. Repeating this
process, we prove (8.4) and (8.3) for any t > 1 by induction on t. □

Now we return to the second and fourth moment estimates of Lemma 5.2. 



8.1 Proof of Lemma 5.2 for p = 2. 



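Now we prove the special case of Lemma 5.2 for p = 2. The second moment of $\frac1N \sum_{i=1}^N Z_i$ is given by

$\mathbb E \Big| \frac1N \sum_{i=1}^N Z_i \Big|^2 = \frac{1}{N^2} \sum_{\alpha \ne \beta} \mathbb E\, \overline{Z_\alpha} Z_\beta + \frac{1}{N^2} \sum_{\alpha} \mathbb E |Z_\alpha|^2$. (8.7)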



We start with estimating the first term of (8.7) for $\alpha = 1$ and $\beta = 2$. The basic idea is to rewrite $G^{(1)}_{kl}$ as

$G^{(1)}_{kl} = P^{(1),(2)}_{kl} + P^{(1),\emptyset}_{kl}, \qquad k, l \ne 1,$ (8.8)

with $P^{(1),(2)}_{kl}$ independent of $\mathbf a^1$ and $\mathbf a^2$, and $P^{(1),\emptyset}_{kl}$ independent of $\mathbf a^1$. The P's have two upper indices. The first
one refers to the fact that it comes from the $H^{(1)}$ minor (i.e. follows the upper index of $G^{(1)}$) and the second
one indicates the additional independence.

To construct this decomposition for $k, l \notin \{1, 2\}$, by (4.15) or (4.16) we can rewrite $G^{(1)}_{kl}$ as

$G^{(1)}_{kl} = G^{(12)}_{kl} + \frac{G^{(1)}_{k2} G^{(1)}_{2l}}{G^{(1)}_{22}}, \qquad k, l \notin \{1, 2\}$. (8.9)

The first term on the r.h.s. is independent of $\mathbf a^2$. With Lemma 8.1, we have that the bound

$\Big| \frac{G^{(1)}_{k2} G^{(1)}_{2l}}{G^{(1)}_{22}} \Big| \le CX^2$ (8.10)

holds with a very high probability.
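The identity used in (8.9) is a purely algebraic (Schur complement) fact about resolvents of minors, so it can be sanity-checked on a small random matrix. The sketch below is only an illustration: the matrix ensemble, the spectral parameter, and the helper names are our own choices, and the paper's indices 1 and 2 correspond to the Python indices 0 and 1.

```python
import random

def inverse(a):
    """Invert a square complex matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]

def minor_green(h, z, removed):
    """Green function of the minor of h with the rows/columns in `removed` deleted.

    The result is a dict keyed by the ORIGINAL index pairs (k, l)."""
    keep = [i for i in range(len(h)) if i not in removed]
    sub = [[h[i][j] - (z if i == j else 0.0) for j in keep] for i in keep]
    g = inverse(sub)
    return {(keep[a], keep[b]): g[a][b] for a in range(len(keep)) for b in range(len(keep))}

def random_hermitian(n, seed=0):
    """A small random Hermitian test matrix (illustrative normalization)."""
    rng = random.Random(seed)
    h = [[0j] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = complex(rng.gauss(0, 1), 0) / n ** 0.5
        for j in range(i + 1, n):
            h[i][j] = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / (2 * n) ** 0.5
            h[j][i] = h[i][j].conjugate()
    return h

def max_identity_error(n=8, z=0.3 + 0.5j, seed=0):
    """Largest deviation from G1[k,l] = G12[k,l] + G1[k,2] G1[2,l] / G1[2,2] (paper indices)."""
    h = random_hermitian(n, seed)
    g1 = minor_green(h, z, {0})      # paper's G^(1)
    g12 = minor_green(h, z, {0, 1})  # paper's G^(12)
    err = 0.0
    for k in range(2, n):
        for l in range(2, n):
            rhs = g12[(k, l)] + g1[(k, 1)] * g1[(1, l)] / g1[(1, 1)]
            err = max(err, abs(g1[(k, l)] - rhs))
    return err

if __name__ == "__main__":
    print(max_identity_error())
```

The reported error should be at the level of floating-point rounding; the probabilistic content of the argument lies in the size bound (8.10), not in the identity itself.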
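Next we define $P^{(1),(T)}_{kl}$ for $k, l \ne 1$.

1. If $k, l \ne 2$,

$P^{(1),(2)}_{kl} = G^{(12)}_{kl}, \qquad P^{(1),\emptyset}_{kl} = \frac{G^{(1)}_{k2} G^{(1)}_{2l}}{G^{(1)}_{22}}$. (8.11)

2. If $k = 2$ or $l = 2$,

$P^{(1),(2)}_{kl} = 0, \qquad P^{(1),\emptyset}_{kl} = G^{(1)}_{kl}$. (8.12)

Hence (8.8) holds and $P^{(1),(2)}$ is independent of $\mathbf a^2$.

With this convention, we have the following expansion of $Z_1$:

$Z_1 = \mathbb{IE}_1\, \mathbf a^1 \cdot P^{(1),(2)} \mathbf a^1 + \mathbb{IE}_1\, \mathbf a^1 \cdot P^{(1),\emptyset} \mathbf a^1$. (8.13)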

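Lemma 8.2 For $N^{-1} \le \eta \le 10$ and fixed $p \in \mathbb N$, we have the following estimates

$\mathbb E\, |\mathbf a^1 \cdot P^{(1),\emptyset} \mathbf a^1|^p \le C_p \big((\log N)^{3+2\alpha}\big)^p X^{2p}$, (8.14)

$\mathbb E\, |\mathbf a^1 \cdot P^{(1),(2)} \mathbf a^1|^p \le C_p \big((\log N)^{3+2\alpha}\big)^p X^{p}$. (8.15)

Since $X^2 \le X$ in $D^*$, this lemma also implies that

$\mathbb E\, |Z_i|^p \le C_p \big((\log N)^{3+2\alpha}\big)^p X^p, \qquad 1 \le i \le N$. (8.16)

Proof. First we rewrite $\mathbf a^1 \cdot P^{(1),\emptyset} \mathbf a^1$ as follows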



1~ / ^ h O ^ 



A: 2 ^ 2 i 



G. 



(1) 



k,l^2 \ ^22 / fc5!t2 

By the large deviation estimate (4.19), we have 



) a? + 5: 4Gi^)a^ + aiG^a,^ + a^G^a^ 

/ kit2 1^2 



k,l^2 



fe I ^(1) 



G 



22 



> C{logNf+'^°'X^ < Ar-=i°gi°gJV_ 



Similarly, from (4.17), using Oi as a^, a\, . . . , ajy and keeping fixed, we have 



E 

k^2 



%'-'fe2^2 



> G(logiV)3/2+"x|a^| < Ar-=iogi°gJV^ 



(8.11) 
(8.12) 

(8.13) 

(8.14) 
(8.15) 

(8.16) 
(8.17) 



(8.18) 



(8.19) 



By (4.35), $\|\mathbf a^1\|_\infty \le (\log N)^{2\alpha} M^{-1/2}$ holds with a very high probability. We can thus replace $|a^1_2|$ by
$(\log N)^{2\alpha} M^{-1/2}$ in (8.19). The third term in (8.17) can be estimated in the same way, and the last term can
be bounded by $(\log N)^{4\alpha} M^{-1}$ with very high probability.
Since $\eta \le 10$, by the definition of X in (5.6) we have

X^ >C{logNy/M. (8.20) 
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Thus 



and wc have proved that 



(log7V)3/2+3"^ + (iogAr)4"J_ < C(log7V)3+2"x2, 
v il/ M 



< C{\ogN) 



3+2q -1^2 



> 1- AT 



— c log log A'' 



..21) 



This inequality implies the desired inequality (8.14) except for the contribution from the exceptional set
where (8.21) fails. Since all Green functions are bounded by $\eta^{-1} \le N$, the contribution from the exceptional
set is negligible and this proves (8.14). Finally, a similar proof yields (8.15). □



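Exchanging the indices 1 and 2, we can define $P^{(2),(1)}$ and $P^{(2),\emptyset}$ and expand $Z_2$ as

$Z_2 = \mathbb{IE}_2\, \mathbf a^2 \cdot P^{(2),(1)} \mathbf a^2 + \mathbb{IE}_2\, \mathbf a^2 \cdot P^{(2),\emptyset} \mathbf a^2$. (8.22)

Here $P^{(2),(1)}$ is independent of $\mathbf a^1$ and $\mathbf a^2$; $P^{(2),\emptyset}$ is independent of $\mathbf a^2$. Combining (8.22) with (8.13), we have

$\mathbb E\, \overline{Z_1} Z_2 = \mathbb E\, \overline{\Big( \mathbb{IE}_1 \big\{ \mathbf a^1 \cdot P^{(1),(2)} \mathbf a^1 + \mathbf a^1 \cdot P^{(1),\emptyset} \mathbf a^1 \big\} \Big)} \Big( \mathbb{IE}_2 \big\{ \mathbf a^2 \cdot P^{(2),(1)} \mathbf a^2 + \mathbf a^2 \cdot P^{(2),\emptyset} \mathbf a^2 \big\} \Big)$. (8.23)

The only non-vanishing term on the right-hand side is

$\mathbb E\, \overline{\big( \mathbb{IE}_1\, \mathbf a^1 \cdot P^{(1),\emptyset} \mathbf a^1 \big)} \big( \mathbb{IE}_2\, \mathbf a^2 \cdot P^{(2),\emptyset} \mathbf a^2 \big)$. (8.24)

By the Cauchy-Schwarz inequality and Lemma 8.2, we obtain

$|\mathbb E\, \overline{Z_1} Z_2| \le C \big((\log N)^{3+2\alpha}\big)^2 X^4$. (8.25)

Similarly, Lemma 8.2 and (8.20) imply that

$\mathbb E |Z_1|^2 \le C \big((\log N)^{3+2\alpha}\big)^2 X^2 \le CN \big((\log N)^{3+2\alpha}\big)^2 X^4$.

Since the indices 1 and 2 can be replaced by any $\alpha \ne \beta$, together with (8.7) we have thus proved Lemma 5.2 for
p = 2.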



8.2 Proof of Lemma 5.2 for p = 4

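Now we prove the special case of Lemma 5.2 for p = 4:

$\mathbb E \Big| \frac1N \sum_{\alpha=1}^N Z_\alpha \Big|^4 \le CN^{-4} \sum_{1 \le \alpha < \beta < \chi < \gamma \le N} |\mathbb E\, Z_\alpha Z_\beta Z_\chi Z_\gamma| + CN^{-4} \sum_{1 \le \alpha < \beta < \chi \le N} |\mathbb E\, |Z_\alpha|^2 Z_\beta Z_\chi| + \ldots$

$\qquad + CN^{-4} \sum_{1 \le \alpha < \beta \le N} \big( \mathbb E |Z_\alpha|^2 |Z_\beta|^2 + |\mathbb E\, |Z_\alpha|^2 Z_\alpha Z_\beta| \big) + CN^{-4} \sum_{1 \le \alpha \le N} \mathbb E |Z_\alpha|^4$. (8.26)

Here $\ldots$ means the permutation of the ordered indices and the complex conjugate operators. We are
going to compute the first two terms in the r.h.s. of (8.26). The other two terms can be treated analogously.
By the permutation symmetry of the indices, we can assume that $\alpha = 1$, $\beta = 2$, $\chi = 3$ and $\gamma = 4$. As in the
estimate for the second moment, the key idea is to decompose $Z^{(1)}_{11}$ in a suitable way: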

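Lemma 8.3 There exist two decompositions of $Z^{(1)}_{11}$,

$Z^{(1)}_{11} = \sum_{T \subset \{2,3\}} \mathbf a^1 \cdot Q^{(1),(T)} \mathbf a^1 = \sum_{T \subset \{2,3,4\}} \mathbf a^1 \cdot R^{(1),(T)} \mathbf a^1$, (8.27)

such that $Q^{(1),(T)}$ and $R^{(1),(T)}$ are independent of the rows in $T \cup \{1\}$, i.e.,

$\frac{\partial (\mathbf a^1 \cdot Q^{(1),(T)} \mathbf a^1)}{\partial a^i_j} = \frac{\partial (\mathbf a^1 \cdot Q^{(1),(T)} \mathbf a^1)}{\partial \overline{a^i_j}} = 0, \qquad i \in T \subset \{2,3\}, \quad 1 \le j \le N$, (8.28)

$\frac{\partial (\mathbf a^1 \cdot R^{(1),(T)} \mathbf a^1)}{\partial a^i_j} = \frac{\partial (\mathbf a^1 \cdot R^{(1),(T)} \mathbf a^1)}{\partial \overline{a^i_j}} = 0, \qquad i \in T \subset \{2,3,4\}, \quad 1 \le j \le N$. (8.29)

Furthermore, the decompositions can be chosen in such a way that for all $N^{-1} \le \eta \le 10$ the following
estimates hold:

$\mathbb E\, |\mathbf a^1 \cdot Q^{(1),(T)} \mathbf a^1|^p \le C_p \big((\log N)^{3+2\alpha}\big)^p \big(X^{3-|T|}\big)^p, \qquad p \in \mathbb N$, (8.30)

and

$\mathbb E\, |\mathbf a^1 \cdot R^{(1),(T)} \mathbf a^1|^p \le C_p \big((\log N)^{3+2\alpha}\big)^p \big(X^{4-|T|}\big)^p, \qquad p \in \mathbb N$. (8.31)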

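We postpone the proof of this lemma and first finish the proof of Lemma 5.2 in the case of p = 4. It is
clear that Lemma 8.3 holds for different index combinations. E.g. $Z^{(2)}_{22}$ can be decomposed as

$Z^{(2)}_{22} = \sum_{T \subset \{1,3,4\}} \mathbf a^2 \cdot R^{(2),(T)} \mathbf a^2$ (8.32)

and the $R^{(2),(T)}$'s have the same properties (except for the exchange of 1 and 2) as $R^{(1),(T)}$ in (8.29) and (8.31). By
this property, we can estimate the first term on the r.h.s. of (8.26) by

$\Big| \mathbb E\, \big(\mathbb{IE}_1 Z^{(1)}_{11}\big) \big(\mathbb{IE}_2 Z^{(2)}_{22}\big) \big(\mathbb{IE}_3 Z^{(3)}_{33}\big) \big(\mathbb{IE}_4 Z^{(4)}_{44}\big) \Big|$ (8.33)

$\le \sum_{T_1 \subset \{2,3,4\}} \sum_{T_2 \subset \{1,3,4\}} \sum_{T_3 \subset \{1,2,4\}} \sum_{T_4 \subset \{1,2,3\}} \Big| \mathbb E\, \big[\mathbb{IE}_1\, \mathbf a^1 \cdot R^{(1),(T_1)} \mathbf a^1\big] \big[\mathbb{IE}_2\, \mathbf a^2 \cdot R^{(2),(T_2)} \mathbf a^2\big] \big[\mathbb{IE}_3\, \mathbf a^3 \cdot R^{(3),(T_3)} \mathbf a^3\big] \big[\mathbb{IE}_4\, \mathbf a^4 \cdot R^{(4),(T_4)} \mathbf a^4\big] \Big|$.

Consider a term consisting of products of factors with $\bigcap_{j=1,2,3,4} (T_j \cup \{j\}) \ne \emptyset$. Then there is an element $\ell \in
\{1, 2, 3, 4\}$ in the common intersection so that integration w.r.t. the row $\mathbf a^\ell$ vanishes. Hence the nonvanishing
terms consist of products of terms with $\bigcap_{j=1,2,3,4} (T_j \cup \{j\}) = \emptyset$, i.e., $\bigcup_{j=1,2,3,4} (T_j \cup \{j\})^c = \{1, 2, 3, 4\}$. Here
the notation $^c$ means the complement in $\{1, 2, 3, 4\}$. Thus we have

$\sum_{j=1}^4 (4 - |T_j| - 1) \ge 4 \implies \sum_{j=1}^4 (4 - |T_j|) \ge 8$.

Using (8.31) and the Schwarz inequality, we have thus proved that

$\Big| \mathbb E\, \big(\mathbb{IE}_1 Z^{(1)}_{11}\big) \big(\mathbb{IE}_2 Z^{(2)}_{22}\big) \big(\mathbb{IE}_3 Z^{(3)}_{33}\big) \big(\mathbb{IE}_4 Z^{(4)}_{44}\big) \Big| \le C \big((\log N)^{3+2\alpha}\big)^4 X^8$.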

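We now estimate the second term in the r.h.s. of (8.26).

$\mathbb E\, \big|\mathbb{IE}_1 Z^{(1)}_{11}\big|^2 \big(\mathbb{IE}_2 Z^{(2)}_{22}\big) \big(\mathbb{IE}_3 Z^{(3)}_{33}\big) = \mathbb E \Big[ \mathbb{IE}_1 \sum_{T_0 \subset \{2,3\}} \mathbf a^1 \cdot Q^{(1),(T_0)} \mathbf a^1 \Big] \overline{\Big[ \mathbb{IE}_1 \sum_{T_1 \subset \{2,3\}} \mathbf a^1 \cdot Q^{(1),(T_1)} \mathbf a^1 \Big]} \Big[ \mathbb{IE}_2 \sum_{T_2 \subset \{1,3\}} \mathbf a^2 \cdot Q^{(2),(T_2)} \mathbf a^2 \Big] \Big[ \mathbb{IE}_3 \sum_{T_3 \subset \{1,2\}} \mathbf a^3 \cdot Q^{(3),(T_3)} \mathbf a^3 \Big]$. (8.34)

Consider a term consisting of products of factors with $\big[\bigcap_{j=2,3} (T_j \cup \{j\})\big] \cap T_0 \cap T_1 \ne \emptyset$. Then there is
an element $\ell \in \{2, 3\}$ in the common intersection and the integration w.r.t. the row $\mathbf a^\ell$ vanishes. Thus
the nonvanishing terms consist of products of terms with $\big[\bigcap_{j=2,3} (T_j \cup \{j\})\big] \cap T_0 \cap T_1 = \emptyset$. In particular,
$\{2,3\} \subset \bigcup_{j=2,3} (T_j \cup \{j\})^c \cup [\{2,3\} \setminus T_0] \cup [\{2,3\} \setminus T_1]$. Here the notation $^c$ means the complement in $\{1,2,3\}$.
Thus we have

$\sum_{j=0}^3 (2 - |T_j|) \ge 2 \implies \sum_{j=0}^3 (3 - |T_j|) \ge 6$.

Using (8.30), (8.20) and the Schwarz inequality, we have

$\Big| \mathbb E\, \big|\mathbb{IE}_1 Z^{(1)}_{11}\big|^2 \big(\mathbb{IE}_2 Z^{(2)}_{22}\big) \big(\mathbb{IE}_3 Z^{(3)}_{33}\big) \Big| \le C \big((\log N)^{3+2\alpha}\big)^4 X^6 \le CN \big((\log N)^{3+2\alpha}\big)^4 X^8$.

For the other terms in (8.26), we can just use the Schwarz inequality and (8.16). We have thus proved
Lemma 5.2 for p = 4.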

We now prove Lemma 8.3. First we prove the properties of the Q's. Notice that the decomposition with Q's
in (8.27) removes the dependence on rows 2, 3. The starting point is an expansion of $G^{(1)}_{kl}$,

$G^{(1)}_{kl} = \sum_{T \subset \{2,3\}} Q^{(1),(T)}_{kl} = Q^{(1),\emptyset}_{kl} + Q^{(1),(2)}_{kl} + Q^{(1),(3)}_{kl} + Q^{(1),(2,3)}_{kl}$, (8.35)

where $Q^{(1),(T)}_{kl}$ is independent of the rows and columns in $T \cup \{1\}$. Using the notation $(1U)$ for $(\{1\} \cup U)$, one
can check that a solution for Q is given by

$Q^{(1),(T)}_{kl} = \sum_{U:\, T \subset U \subset \{2,3\} \setminus \{k,l\}} (-1)^{|U| - |T|}\, G^{(1U)}_{kl}$. (8.36)

Thus $Q^{(1),(T)}_{kl} = 0$ if k or $l \in T$. By the definition of $Z^{(1)}_{11}$ in (8.2) and by (8.35), we have that the Q's satisfy (8.27).

For any fixed T, $Q^{(1),(T)}_{kl}$ is independent of the rows (columns) in $T \cup \{1\}$. Thus we proved (8.28).

In order to prove (8.30), we give another representation of the Q's. We begin by removing the dependence
of the (kl) matrix element of the Green function on the index 3 for k, l > 3. By (4.15) or (4.16), we can
rewrite the first term of the r.h.s. of (8.9) as

$G^{(12)}_{kl} = G^{(123)}_{kl} + \frac{G^{(12)}_{k3} G^{(12)}_{3l}}{G^{(12)}_{33}}, \qquad k, l \notin \{1,2,3\}$. (8.37)

This removes the dependence of $G^{(12)}_{kl}$ on the index 3 with the last term as the error term. For the last term
on the r.h.s. of (8.9), using (4.15) and (4.16) again, we have

$G^{(1)}_{k2} = G^{(13)}_{k2} + \frac{G^{(1)}_{k3} G^{(1)}_{32}}{G^{(1)}_{33}}, \qquad G^{(1)}_{2l} = G^{(13)}_{2l} + \frac{G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33}}, \qquad G^{(1)}_{22} = G^{(13)}_{22} + \frac{G^{(1)}_{23} G^{(1)}_{32}}{G^{(1)}_{33}}$. (8.38)

The last equality implies

$\frac{1}{G^{(1)}_{22}} = \frac{1}{G^{(13)}_{22}} - \frac{G^{(1)}_{23} G^{(1)}_{32}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22}}$. (8.39)

This removes the dependence on the index 3 of both the Green functions and their inverse in the last term
in (8.9). Inserting (8.37)-(8.39) into (8.9), we obtain that if $k, l \notin \{1,2,3\}$

$G^{(1)}_{kl} = G^{(123)}_{kl} + \frac{G^{(12)}_{k3} G^{(12)}_{3l}}{G^{(12)}_{33}} + \Big( G^{(13)}_{k2} + \frac{G^{(1)}_{k3} G^{(1)}_{32}}{G^{(1)}_{33}} \Big) \Big( G^{(13)}_{2l} + \frac{G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33}} \Big) \Big( \frac{1}{G^{(13)}_{22}} - \frac{G^{(1)}_{23} G^{(1)}_{32}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22}} \Big)$. (8.40)

So for $k, l \notin \{1, 2, 3\}$, we define $Q^{(1),(T)}_{kl}$ as follows

$Q^{(1),(2,3)}_{kl} = G^{(123)}_{kl}, \qquad Q^{(1),(2)}_{kl} = \frac{G^{(12)}_{k3} G^{(12)}_{3l}}{G^{(12)}_{33}}, \qquad Q^{(1),(3)}_{kl} = \frac{G^{(13)}_{k2} G^{(13)}_{2l}}{G^{(13)}_{22}}$,

$Q^{(1),\emptyset}_{kl} = \frac{G^{(1)}_{k3} G^{(1)}_{32} G^{(13)}_{2l}}{G^{(1)}_{33} G^{(13)}_{22}} + \frac{G^{(13)}_{k2} G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33} G^{(13)}_{22}} + \frac{G^{(1)}_{k3} G^{(1)}_{32} G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33} G^{(1)}_{33} G^{(13)}_{22}} - \frac{G^{(13)}_{k2} G^{(1)}_{23} G^{(1)}_{32} G^{(13)}_{2l}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22}}$ (8.41)

$\qquad - \frac{G^{(1)}_{k3} G^{(1)}_{32} G^{(1)}_{23} G^{(1)}_{32} G^{(13)}_{2l}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22} G^{(1)}_{33}} - \frac{G^{(13)}_{k2} G^{(1)}_{23} G^{(1)}_{32} G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22} G^{(1)}_{33}} - \frac{G^{(1)}_{k3} G^{(1)}_{32} G^{(1)}_{23} G^{(1)}_{32} G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33} G^{(1)}_{22} G^{(13)}_{22} G^{(1)}_{33} G^{(1)}_{33}}$.

One can see that in this case, $k, l \notin \{1, 2, 3\}$, (8.35) holds and the $Q^{(1),(T)}_{kl}$'s are independent of the rows (columns)
in $T \cup \{1\}$. For k = 2, 3 or l = 2, 3 the previous formulas for Q do not make sense. But in this case, we do
not need to decompose $G^{(1)}_{kl}$ in such fine details and we will use the simple decomposition

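$G^{(1)}_{2l} = G^{(13)}_{2l} + \frac{G^{(1)}_{23} G^{(1)}_{3l}}{G^{(1)}_{33}}, \quad l \ne 3, \qquad$ and $\qquad G^{(1)}_{2l} = G^{(1)}_{23}, \quad l = 3$,

$G^{(1)}_{3l} = G^{(12)}_{3l} + \frac{G^{(1)}_{32} G^{(1)}_{2l}}{G^{(1)}_{22}}, \quad l \ne 2, \qquad$ and $\qquad G^{(1)}_{3l} = G^{(1)}_{32}, \quad l = 2$.

More precisely, we define $Q^{(1),(T)}_{kl}$ by

1. For k = 2 and $l \ne 3$, $Q^{(1),(2)}_{kl} = Q^{(1),(2,3)}_{kl} = 0$, $Q^{(1),(3)}_{kl} = G^{(13)}_{2l}$ and $Q^{(1),\emptyset}_{kl} = G^{(1)}_{23} G^{(1)}_{3l} / G^{(1)}_{33}$.

2. For k = 2 and l = 3, $Q^{(1),(2)}_{kl} = Q^{(1),(3)}_{kl} = Q^{(1),(2,3)}_{kl} = 0$ and $Q^{(1),\emptyset}_{kl} = G^{(1)}_{23}$.

3. For k = 3 and $l \ne 2$, $Q^{(1),(3)}_{kl} = Q^{(1),(2,3)}_{kl} = 0$, $Q^{(1),(2)}_{kl} = G^{(12)}_{3l}$ and $Q^{(1),\emptyset}_{kl} = G^{(1)}_{32} G^{(1)}_{2l} / G^{(1)}_{22}$.

4. For k = 3 and l = 2, $Q^{(1),(2)}_{kl} = Q^{(1),(3)}_{kl} = Q^{(1),(2,3)}_{kl} = 0$ and $Q^{(1),\emptyset}_{kl} = G^{(1)}_{32}$.

Similarly, we can define $Q^{(1),(T)}_{kl}$ for the cases l = 2 or l = 3. We now list the properties of $Q^{(1),(T)}_{kl}$ for $k, l \ne 1$
and $T \subset \{2,3\}$:

1. The $Q^{(1),(T)}_{kl}$'s are independent of rows (columns) in $T \cup \{1\}$ and (8.35) holds.

2. $Q^{(1),(T)}_{kl} = 0$ if k or $l \in T$. (8.42)

3. If k = l and $T \cup \{k\} = \{2, 3\}$, then

$Q^{(1),(T)}_{kk} = G^{(1T)}_{kk}$. (8.43)

For all other cases, $Q^{(1),(T)}_{kl}$ is a finite sum of terms of the form:

$\frac{G_o G_o \cdots G_o}{G_d G_d \cdots G_d}$, (8.44)

where each $G_o$ ($G_d$ resp.) represents some off-diagonal (diagonal resp.) matrix element of $G^{(1U)}$ with
U some finite set. Furthermore, for $k \ne l$ or $T \cup \{k\} \ne \{2, 3\}$, the number of the off-diagonal elements
in the numerator of (8.44) is strictly bigger than $|\{2, 3\} \setminus (T \cup \{k, l\})|$. Using Lemma 8.1, in the set $\Omega^c$
we have

$|Q^{(1),(T)}_{kl}| \le C \big( X^{|\{2,3\} \setminus (T \cup \{k,l\})| + 1} + \mathbf 1(T \cup \{k\} = \{2,3\},\, k = l) \big)$. (8.45)

Since the probability of the exceptional set $\Omega$ is extremely small, a simple argument which we repeated many
times shows that it can be neglected in the estimate of the expectation in (8.30). Hence (8.30) follows from
(8.45).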

The proof of (8.30) shows clearly the approach to remove an element one by one from the Green function.
Define $R^{(1),(T)}_{kl}$ as follows (like the Q's in (8.36))

$R^{(1),(T)}_{kl} = \sum_{U:\, T \subset U \subset \{2,3,4\} \setminus \{k,l\}} (-1)^{|U| - |T|}\, G^{(1U)}_{kl}$. (8.46)

Using the same method we used for the Q's, one can prove the properties of the R's in Lemma 8.3. The details will
be omitted since we will prove the general cases in the next section.



9 General case 

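The first step to prove the general cases of Lemma 5.2 is to extend the decomposition (8.27). For any fixed
i, $1 \le i \le N$, and a fixed set $\mathbb S = \{i_1, i_2, \dots, i_s\}$ such that $i \notin \mathbb S$, $1 \le i_j \le N$, our goal is to decompose $Z^{(i)}_{ii}$
so that the following lemma holds:

Lemma 9.1 For $i \notin \mathbb S$, $T \subset \mathbb S$ and $\eta \ge 1/N$, there is a decomposition of $Z^{(i)}_{ii}$,

$Z^{(i)}_{ii} = \sum_{T \subset \mathbb S} Z^{(i),\mathbb S,(T)}, \qquad Z^{(i),\mathbb S,(T)} = \sum_{k,l} \overline{a^i_k}\, \mathcal G^{(i),\mathbb S,(T)}_{kl}\, a^i_l$, (9.1)

such that

(1) $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is independent of the rows or columns of H in $\{i\} \cup T$, i.e.,

$\frac{\partial \mathcal G^{(i),\mathbb S,(T)}_{kl}}{\partial a^a_b} = 0, \qquad \frac{\partial \mathcal G^{(i),\mathbb S,(T)}_{kl}}{\partial \overline{a^a_b}} = 0, \qquad a \in \{i\} \cup T, \quad 1 \le b \le N$. (9.2)

(2) For any positive integer k,

$\mathbb E\, \big| Z^{(i),\mathbb S,(T)} \big|^k \le C_{k,s} \big((\log N)^{3+2\alpha}\big)^k \big(X^{s-t+1}\big)^k, \qquad s = |\mathbb S|, \quad t = |T|$. (9.3)

In the applications, $\mathbb S$ will be the set of indices, the dependencies of which we wish to isolate in $Z^{(i)}_{ii}$. For
example, for the case i = 1 and $\mathbb S = \{2\}$ or $\mathbb S = \{2, 3\}$, respectively, if we define

$Z^{(1),\{2\},(T)} = \mathbf a^1 \cdot P^{(1),(T)} \mathbf a^1, \qquad Z^{(1),\{2,3\},(T)} = \mathbf a^1 \cdot Q^{(1),(T)} \mathbf a^1$, (9.4)

then (9.3) follows from (8.14), (8.15) and (8.30).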

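To achieve the decomposition (9.1), as in (8.36) in Section 8.2, we start with a decomposition of $G^{(i)}_{kl}$.

Definition 9.1 As in Lemma 4.3, we use the notation $(iT)$ for $(\{i\} \cup T)$. For $1 \le i \le N$, $i \notin \mathbb S =
\{i_1, i_2, \dots, i_s\}$ and $T \subset \mathbb S$, we define

$\mathcal G^{(i),\mathbb S,(T)}_{kl} = \sum_{U:\, T \subset U \subset \mathbb S \setminus \{k,l\}} (-1)^{|U| - |T|}\, G^{(iU)}_{kl}$. (9.5)

For example, by (8.36), for the case $\mathbb S = \{2,3\}$ and i = 1, we have $\mathcal G^{(1),\{2,3\},(T)}_{kl} = Q^{(1),(T)}_{kl}$. For
$\mathbb S = \{2\}$ and i = 1, from (8.11) and (8.12) we have $\mathcal G^{(1),\{2\},(T)}_{kl} = P^{(1),(T)}_{kl}$.

From this definition one can easily check that

1. $\mathcal G^{(i),\mathbb S,(T)}_{kl} = 0$, if k or $l \in T \cup \{i\}$. (9.6)

2. For $k, l \notin T \cup \{i\}$,

$\mathcal G^{(i),\mathbb S,(T)}_{kl} = \mathcal G^{(i),\mathbb S \setminus \{k,l\},(T)}_{kl}$. (9.7)

3. $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is independent of the rows or columns of H in $\{i\} \cup T$, i.e.,

$\frac{\partial \mathcal G^{(i),\mathbb S,(T)}_{kl}}{\partial a^a_b} = 0, \qquad \frac{\partial \mathcal G^{(i),\mathbb S,(T)}_{kl}}{\partial \overline{a^a_b}} = 0, \qquad a \in \{i\} \cup T, \quad 1 \le b \le N$. (9.8)

4. All quantities defined so far depend on the initial matrix H, omitted in our notations. If we wish
to specify which matrix is being considered, we will insert the matrix. For example, $\mathcal G^{(i),\mathbb S \setminus T,\emptyset}_{kl}(H^{(T)})$
means it is defined w.r.t. $H^{(T)}$, which is the $N - |T|$ by $N - |T|$ minor of H after removing the rows
and columns in T. Clearly, we have the relation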

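$\mathcal G^{(i),\mathbb S,(T)}_{kl}(H) = \mathcal G^{(i),\mathbb S \setminus T,\emptyset}_{kl}(H^{(T)})$. (9.9)

With these definitions, we can decompose $G^{(i)}_{kl}$ as follows.

Lemma 9.2 For fixed i, $\mathbb S = \{i_1, i_2, \dots, i_s\}$ such that $i \notin \mathbb S$, we have the decomposition

$G^{(i)}_{kl} = \sum_{T \subset \mathbb S} \mathcal G^{(i),\mathbb S,(T)}_{kl}$. (9.10)

Proof. Using the definition (9.5), we have

$\sum_{T \subset \mathbb S} \mathcal G^{(i),\mathbb S,(T)}_{kl} = \sum_{T \subset \mathbb S} \Big( \sum_{U:\, T \subset U \subset \mathbb S \setminus \{k,l\}} (-1)^{|U| - |T|}\, G^{(iU)}_{kl} \Big) = \sum_{U \subset \mathbb S \setminus \{k,l\}} G^{(iU)}_{kl} \Big( \sum_{T \subset U} (-1)^{|U| - |T|} \Big)$. (9.11)

Since $\sum_{T \subset U} (-1)^{|U| - |T|} = 0$ unless $U = \emptyset$, we obtain (9.10) and this concludes Lemma 9.2. □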
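The combinatorial core of Lemma 9.2 is an inclusion-exclusion identity that does not depend on what the summands are. The sketch below checks it for an arbitrary function `f` of a finite set `U`, which stands in for $G^{(iU)}_{kl}$; the restriction to $U \subset \mathbb S \setminus \{k,l\}$ is ignored here, and all names are illustrative.

```python
from itertools import combinations

def subsets(s):
    """All subsets of a finite set, as frozensets."""
    s = sorted(s)
    for r in range(len(s) + 1):
        for c in combinations(s, r):
            yield frozenset(c)

def decompose(f, S):
    """Map each T in S to the sum over U with T in U in S of (-1)^(|U|-|T|) f(U)."""
    S = frozenset(S)
    return {T: sum((-1) ** (len(U) - len(T)) * f(U)
                   for U in subsets(S) if T <= U)
            for T in subsets(S)}

def sign_sum(U):
    """Sum over T in U of (-1)^(|U|-|T|): equals 1 if U is empty and 0 otherwise."""
    return sum((-1) ** (len(U) - len(T)) for T in subsets(U))

if __name__ == "__main__":
    # An arbitrary test function of the removed index set (a stand-in, not a Green function).
    f = lambda U: 5 + 3 * len(U) + sum(x * x for x in U)
    pieces = decompose(f, {2, 3, 4})
    print(sum(pieces.values()), f(frozenset()))  # the two numbers agree
```

The piece indexed by T only involves `f(U)` with U containing T, which mirrors why $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is independent of the rows in T.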

For the special case i = 1 and $\mathbb S = \{2, 3\}$, $\mathcal G^{(1),\{2,3\},(T)}_{kl} = Q^{(1),(T)}_{kl}$ satisfies the estimate (8.45). We now prove
a general form of this estimate on $\mathcal G^{(i),\mathbb S,(T)}_{kl}$.

Lemma 9.3 Let $1 \le i \le N$ and $T \subset \mathbb S = \{i_1, i_2, \dots, i_s\}$ such that $i \notin \mathbb S$. Then there exists a constant C,
depending only on s, such that

$\big| \mathcal G^{(i),\mathbb S,(T)}_{kl} \big| \le C \big( \mathbf 1(T \cup \{k\} = \mathbb S,\, k = l) + X^{|\mathbb S \setminus (T \cup \{k,l\})| + 1} \big)$ in $\Omega^c$, (9.12)

for sufficiently large N depending only on s.

This lemma is the basic estimate for a power counting argument. It shows that the off-diagonal elements
of $\mathcal G^{(i),\mathbb S,(T)}$ are small by a certain power of X, which is our small parameter, depending on the size of the
sets $\mathbb S$ and T. The diagonal elements, when not zero by definition, are estimated by 1 (first term in (9.12)),
but their contribution to the moments of $Z^{(i),\mathbb S,(T)}$ will be small since k = l reduces the double sum in (9.1)
to a single sum.

Proof of Lemma 9.3. For k = l, the estimate (9.12) follows directly from (9.5) and (8.5). We can thus
assume that $k \ne l$ throughout the proof of this lemma. The argument consists of two parts. First we prove
a representation formula (Lemma 9.4) that asserts that $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is a certain rational function involving
resolvent matrix elements of H and some of its minors. The denominators in this rational function are
products of diagonal elements of resolvents and the numerators are products of off-diagonal matrix elements.
In the second step we will estimate these rational functions, using that the diagonal elements of the resolvent
are typically separated away from zero and the off-diagonal elements are small by a factor X.
For the precise argument, we start with the cases:

$T = \emptyset, \quad k, l \notin \mathbb S \quad$ and $\quad \mathbb S \ne \emptyset$. (9.13)

The special case $\mathbb S = \{2\}$ can be proved by the representation (8.11) and Lemma 8.1. The case $\mathbb S = \{2,3\}$
was proved in (8.45). These examples show that $\mathcal G^{(i),\mathbb S,\emptyset}_{kl}$ can be written as the finite sum of the terms of
the form:

$\frac{G_o G_o \cdots G_o}{G_d G_d \cdots G_d}$, (9.14)

where $G_o$ are off-diagonal elements of some $G^{(iU)}$ and $G_d$ are diagonal elements. Furthermore, in each term,
the number of the off-diagonal elements in the numerator is strictly greater than $s = |\mathbb S|$ but less than $4^s$.
The number of the diagonal elements in the denominator is also less than $4^s$.

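The Green function $G^{(iU)}_{kl}$ can be viewed as a function from the vector space of matrices. This motivates
the following definition.

Definition 9.2 Denote by $\mathcal X_K$ the space of $K \times K$ matrices and $\mathcal X = \bigcup_{K=1}^{\infty} \mathcal X_K$. Define $\mathcal V$ as the set of
functions from $\mathcal X$ to the complex numbers. For any $\mathbb S = \{i_1, i_2, \dots, i_s\}$ and for any $i, k, l \notin \mathbb S$, define the set
of off-diagonal matrix elements considered as functions of matrices:

$\mathcal A^{(i),\mathbb S}_{kl} = \big\{ f \in \mathcal V : f(W) = G^{(iU)}_{jj'}(W), \text{ for some } j \ne j', \; j, j' \in \mathbb S \cup \{k, l\}, \; U \subset \mathbb S \big\}$, (9.15)

where $W \in \mathcal X_K$ for some K. Similarly, we define the set of diagonal matrix elements:

$\mathcal D^{(i),\mathbb S}_{kl} = \big\{ f \in \mathcal V : f(W) = G^{(iU)}_{jj}(W) : j \in \mathbb S \cup \{k, l\}, \; U \subset \mathbb S \big\}$. (9.16)

Furthermore we define $\mathcal C^{(i),\mathbb S}_{kl}$ for all k, l as

$\mathcal C^{(i),\mathbb S}_{kl} = \Big\{ F \in \mathcal V : F \text{ is a finite sum of functions of the form } \pm \frac{f_1 f_2 \cdots f_m}{g_1 g_2 \cdots g_{m'}},$

$\qquad f_\alpha \in \mathcal A^{(i),\mathbb S}_{kl}, \; 1 \le \alpha \le m; \quad g_\beta \in \mathcal D^{(i),\mathbb S}_{kl}, \; 1 \le \beta \le m'; \quad s + 1 \le m \le 4^s, \; 0 \le m' \le 4^s \Big\}$, (9.17)

where $s = |\mathbb S|$.

Notice the important condition $m \ge |\mathbb S| + 1$ in the definition of $\mathcal C^{(i),\mathbb S}_{kl}$. Since off-diagonal matrix elements
are typically small, this requirement will guarantee the smallness of elements of $\mathcal C^{(i),\mathbb S}_{kl}$ as a certain power of X.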

With these notations, the equation (8.41) asserts that for $k, l \notin \{2,3\}$, there is a function $F^{(1),\{2,3\}}_{kl} \in
\mathcal C^{(1),\{2,3\}}_{kl}$ such that

$\mathcal G^{(1),\{2,3\},\emptyset}_{kl} = F^{(1),\{2,3\}}_{kl}$. (9.18)

The general case is the following lemma.

Lemma 9.4 For any $\mathbb S = \{i_1, i_2, \dots, i_s\}$ with $s > 0$ and $i, k, l \notin \mathbb S$, there exists a function $F^{(i),\mathbb S}_{kl} \in \mathcal C^{(i),\mathbb S}_{kl}$ such
that

$\mathcal G^{(i),\mathbb S,\emptyset}_{kl} = F^{(i),\mathbb S}_{kl}$. (9.19)

Proof of Lemma 9.4. By symmetry, we only need to prove the cases that

$i = 1, \qquad \mathbb S = \{2, 3, \dots, s+1\}$.

To prove this case, we argue by induction on s. For s = 1 or 2, Lemma 9.4 was proved in (8.11) and (8.41)
(cf. (9.18)). Suppose that Lemma 9.4 is correct for $s = n - 1 \ge 1$ and $F^{(1),\{2,\dots,n\}}_{kl} \in \mathcal C^{(1),\{2,\dots,n\}}_{kl}$ is a
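function satisfying (9.19) for i = 1 and $\mathbb S = \{2, 3, \dots, n\}$.

Now let i = 1, $\mathbb S = \{2, \dots, n+1\}$ and $k, l \notin \{1, \dots, n+1\}$. By the induction assumption,

$\mathcal G^{(1),\{2,\dots,n\},\emptyset}_{kl} = F^{(1),\{2,\dots,n\}}_{kl}$ (9.20)

with $F^{(1),\{2,\dots,n\}}_{kl} \in \mathcal C^{(1),\{2,\dots,n\}}_{kl}$. Thus $F^{(1),\{2,\dots,n\}}_{kl}$ is a finite sum of elements of the form

$\pm \Big( \prod_{\alpha=1}^{m} G^{(1U_\alpha)}_{j_\alpha j'_\alpha} \Big) \Big( \prod_{\alpha=m+1}^{m+m'} G^{(1U_\alpha)}_{j_\alpha j_\alpha} \Big)^{-1}$, (9.21)

where $U_\alpha \subset \mathbb S$, $n \le m \le 4^{n-1}$, $0 \le m' \le 4^{n-1}$ and

$j_\alpha, j'_\alpha \in \{2, 3, \dots, n\} \cup \{k\} \cup \{l\}$.

By the definition of $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ in (9.5), we have

$\mathcal G^{(1),\{2,\dots,n+1\},\emptyset}_{kl} = \mathcal G^{(1),\{2,\dots,n\},\emptyset}_{kl} - \mathcal G^{(1),\{2,\dots,n,n+1\},(n+1)}_{kl}$. (9.22)

Combining (9.9) with (9.20), we have

$\mathcal G^{(1),\{2,3,\dots,n,n+1\},(n+1)}_{kl}(W) = F^{(1),\{2,\dots,n\}}_{kl}(W^{(n+1)})$, (9.23)

where $W^{(n+1)}$ is the minor of W with the (n+1)-th row and (n+1)-th column removed.

We can remove the dependence on the (n+1)-th row by the procedure in (8.37)-(8.39). Using (4.15),
(4.16) and the notation:

$(1U\, n{+}1) = (\{1, n+1\} \cup U)$, (9.24)

we have the expansion

$G^{(1U_\alpha)}_{j_\alpha j'_\alpha} = G^{(1U_\alpha\, n+1)}_{j_\alpha j'_\alpha} + \frac{G^{(1U_\alpha)}_{j_\alpha, n+1}\, G^{(1U_\alpha)}_{n+1, j'_\alpha}}{G^{(1U_\alpha)}_{n+1, n+1}}, \qquad 1 \le \alpha \le m$, (9.25)

and

$\frac{1}{G^{(1U_\alpha)}_{j_\alpha j_\alpha}} = \frac{1}{G^{(1U_\alpha\, n+1)}_{j_\alpha j_\alpha}} - \frac{G^{(1U_\alpha)}_{j_\alpha, n+1}\, G^{(1U_\alpha)}_{n+1, j_\alpha}}{G^{(1U_\alpha)}_{j_\alpha j_\alpha}\, G^{(1U_\alpha\, n+1)}_{j_\alpha j_\alpha}\, G^{(1U_\alpha)}_{n+1, n+1}}, \qquad m + 1 \le \alpha \le m + m'$. (9.26)

We note that the first term on the r.h.s. of (9.25) is exactly the Green function on the l.h.s. of (9.25) except
that there is an additional superscript n+1; a similar comment applies to (9.26).

Inserting (9.25) and (9.26) into (9.21) and expanding it, we obtain that (9.21) is equal to

$\pm \Big( \prod_{\alpha=1}^{m} G^{(1U_\alpha\, n+1)}_{j_\alpha j'_\alpha} \Big) \Big( \prod_{\alpha=m+1}^{m+m'} G^{(1U_\alpha\, n+1)}_{j_\alpha j_\alpha} \Big)^{-1} + \ldots$ (9.27)

Here the first term in (9.27) is the product of the first terms on the right side of (9.25) and (9.26) and it is
the same as (9.21) except that there is an additional superscript n+1. One can see that the other terms
in (9.27) are elements in $\mathcal C^{(1),\{2,\dots,n+1\}}_{kl}$, since the number of the off-diagonal terms in the numerator is now
at least n+1. Since this procedure can be applied to each term in $F^{(1),\{2,\dots,n\}}_{kl}$, we have proved that there
exists an $\widetilde F \in \mathcal C^{(1),\{2,\dots,n+1\}}_{kl}$ such that

$F^{(1),\{2,\dots,n\}}_{kl}(W) = \mathcal G^{(1),\{2,\dots,n+1\},(n+1)}_{kl}(W) + \widetilde F(W)$. (9.28)

By (9.22) and (9.23), we can set $F^{(1),\{2,3,\dots,n,n+1\}}_{kl} = \widetilde F$, which belongs to $\mathcal C^{(1),\{2,3,\dots,n,n+1\}}_{kl}$. We have
thus proved Lemma 9.4 by induction. □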



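Now we start proving the estimates in Lemma 9.3. Using (9.6) and (9.7), we only have to prove (9.12) for
the case $k, l \notin \mathbb S \cup \{i\}$.

Case 1. $T = \mathbb S$: By definition,

$\mathcal G^{(i),\mathbb S,(\mathbb S)}_{kl} = G^{(i\mathbb S)}_{kl}$. (9.29)

Then (9.12) in this special case follows from Lemma 8.1.

Case 2. $T = \emptyset$, $k, l \notin \mathbb S$ and $\mathbb S \ne \emptyset$: By Lemma 9.4 and 8.1, for any $\mathbb S \ne \emptyset$ such that $i, k, l \notin \mathbb S$, we have

$\big| \mathcal G^{(i),\mathbb S,\emptyset}_{kl} \big| \le C\, \frac{\big( \max_{U \subset \mathbb S} \max_{a \ne b} |G^{(iU)}_{ab}| \big)^{s+1}}{\big( \min_{U \subset \mathbb S} \min_{a} |G^{(iU)}_{aa}| \big)^{4^s}} \le CX^{s+1}$, (9.30)

where C depends on $s = |\mathbb S|$.

Case 3. $T \ne \emptyset$, $T \subset \mathbb S$, $T \ne \mathbb S$, $k, l \notin \mathbb S$ and $\mathbb S \ne \emptyset$: By (9.9) and Lemma 9.4, there exists a function
$F^{(i),\mathbb S \setminus T}_{kl} \in \mathcal C^{(i),\mathbb S \setminus T}_{kl}$ (see (9.17)) such that

$\mathcal G^{(i),\mathbb S,(T)}_{kl}(H) = \mathcal G^{(i),\mathbb S \setminus T,\emptyset}_{kl}(H^{(T)}) = F^{(i),\mathbb S \setminus T}_{kl}(H^{(T)})$, (9.31)

where $H^{(T)}$ is the $N - |T|$ by $N - |T|$ minor of H after removing the rows and columns in T. Thus $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is
given by the function $F^{(i),\mathbb S \setminus T}_{kl}$ with all Green functions $G^{(iU)}$ in the definition of $F^{(i),\mathbb S \setminus T}_{kl}$ replaced by $G^{(iUT)}$.
From (9.30) we have

$\big| \mathcal G^{(i),\mathbb S,(T)}_{kl} \big| \le C\, \frac{\big( \max_{U \subset \mathbb S \setminus T} \max_{a \ne b} |G^{(iUT)}_{ab}| \big)^{|\mathbb S \setminus T| + 1}}{\big( \min_{U \subset \mathbb S \setminus T} \min_{a} |G^{(iUT)}_{aa}| \big)^{4^{|\mathbb S \setminus T|}}}$. (9.32)

Using Lemma 8.1, we have that

$\big| \mathcal G^{(i),\mathbb S,(T)}_{kl} \big| \le CX^{|\mathbb S \setminus T| + 1}$ in $\Omega^c$, (9.33)

where C depends on s. We have thus proved (9.12) for Case 3 and this completes the proof of Lemma
9.3. □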



Proof of Lemma 9.1. The decomposition (9.1) follows from (9.10) and (9.2) is a direct consequence of (9.8).
The estimate (9.3) can be proved in the same way as in the proof of Lemma 8.2 using the following three
ingredients: (1) The bounds on $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ in (9.12). (2) The large deviation estimate in Lemma 4.4. (3) The
trivial bound $|\mathcal G^{(i),\mathbb S,(T)}_{kl}| \le C/\eta \le CN$, where C depends on s. This concludes the proof of Lemma 9.1. □

Proof of Lemma 5.2. We first introduce the following notations which will be useful for the expansion of the
p-th moment of $\frac1N \sum_{i=1}^N Z_i$ in (5.5).

Definition 9.3

1. Let $\mathbf V = (v_1, v_2, \dots, v_p)$ be a p-dimensional vector such that $v_i = 0$ or 1 for $1 \le i \le p$.

2. Let $\mathbf S = (\alpha_1, \alpha_2, \dots, \alpha_p)$ be a p-dimensional vector such that $1 \le \alpha_i \le N$ for $1 \le i \le p$.

3. Denote by $\mathbb S$ the set consisting of the elements $\alpha_j$ which are components of $\mathbf S$.

We define

$A(\mathbf S, \mathbf V) = \mathbb E \prod_{i=1}^{p} \big( B^{v_i} Z_{\alpha_i} \big), \qquad B^1(a + ib) = a - ib, \qquad B^0(a + ib) = a + ib$, (9.34)

where B is the complex conjugate operator.



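Through the rest of this section, $\mathbb S$ is always the set generated by $\mathbf S$. Notice that $|\mathbb S| = s \le p$, where p is the
number of components in $\mathbf S$. With these notations, we can estimate $\mathbb E \big| \frac1N \sum_{i=1}^N Z_i \big|^p$ by

$\mathbb E \Big| \frac1N \sum_{i=1}^N Z_i \Big|^p \le N^{-p} \sum_{\mathbf S} \sum_{\mathbf V} |A(\mathbf S, \mathbf V)| \le C_p \sum_{s \le p} N^{s-p} \max_{\mathbf S, \mathbf V :\, |\mathbb S| = s} |A(\mathbf S, \mathbf V)|$, (9.35)

where we sum over $\mathbf S$ and $\mathbf V$ under the conditions in Definition 9.3. Lemma 5.2 is now a simple consequence
of the following estimate on $|A(\mathbf S, \mathbf V)|$.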

Lemma 9.5 Let $\mathbf S$, $\mathbb S$ and $\mathbf V$ satisfy the conditions in Definition 9.3. With $A(\mathbf S, \mathbf V)$ defined in (9.34), there
exists a constant $C > 0$ depending on p such that

$|A(\mathbf S, \mathbf V)| \le C \big((\log N)^{3+2\alpha}\big)^p N^{p-s} X^{2p}$, (9.36)

for sufficiently large N depending only on p.

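Proof. For $1 \le i \le p$, denote the set $\mathbb S_i = \mathbb S \setminus \{\alpha_i\}$. Using (9.1), we expand $A(\mathbf S, \mathbf V)$ as

$A(\mathbf S, \mathbf V) = \mathbb E \sum_{T_1 \subset \mathbb S_1} \cdots \sum_{T_p \subset \mathbb S_p} A(T_1, T_2, \dots, T_p, \mathbf V)$, (9.37)

$A(T_1, T_2, \dots, T_p, \mathbf V) = \big( B^{v_1} \mathbb{IE}_{\alpha_1} Z^{(\alpha_1),\mathbb S_1,(T_1)} \big) \big( B^{v_2} \mathbb{IE}_{\alpha_2} Z^{(\alpha_2),\mathbb S_2,(T_2)} \big) \cdots \big( B^{v_p} \mathbb{IE}_{\alpha_p} Z^{(\alpha_p),\mathbb S_p,(T_p)} \big)$.

From the Schwarz inequality, (9.3) and $|\mathbb S_i| = s - 1$, we obtain that

$|\mathbb E A(T_1, T_2, \dots, T_p, \mathbf V)| \le C \big((\log N)^{3+2\alpha}\big)^p X^{\,ps - \sum_{i=1}^p |T_i|}$, (9.38)

where C depends on p. Suppose that

$\sum_{i=1}^{p} |T_i| \le sp - 2s$. (9.39)

Using (8.20), i.e., $X^2 \ge (\log N)^3/M \ge 1/N$, we have

$|\mathbb E A(T_1, T_2, \dots, T_p, \mathbf V)| \le C \big((\log N)^{3+2\alpha}\big)^p X^{2s} \le C \big((\log N)^{3+2\alpha}\big)^p N^{p-s} X^{2p}$. (9.40)

It remains to estimate $\mathbb E A(T_1, T_2, \dots, T_p, \mathbf V)$ for the cases that

$\sum_{i=1}^{p} |T_i| \ge sp - 2s + 1$. (9.41)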
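For $\gamma \in \mathbb S$, denote by $n_\gamma$ the number of times that $\gamma$ appears in $\{\alpha_1\} \cup T_1$, $\{\alpha_2\} \cup T_2$, ... and $\{\alpha_p\} \cup T_p$,

$n_\gamma = \sum_{k=1}^{p} \mathbf 1(\gamma \in \{\alpha_k\} \cup T_k)$.

By definition, $n_\gamma \ge 1$. Similarly, we define $m_\gamma$ to be the number of times that $\gamma$ appears in $(\alpha_1, \alpha_2, \dots, \alpha_p)$,

$m_\gamma = \sum_{k=1}^{p} \mathbf 1(\gamma = \alpha_k)$.

Let $x = |\{\gamma \in \mathbb S : n_\gamma = p\}|$ and $y = |\{\gamma \in \mathbb S : m_\gamma = 1\}|$. Since for each fixed i, $\alpha_i \notin T_i$, then with (9.41) and
the definition of $n_\gamma$,

$(p-1)(s-x) + xp \ge \sum_{\gamma \in \mathbb S} n_\gamma = \sum_{i=1}^{p} |\{\alpha_i\} \cup T_i| = p + \sum_{i=1}^{p} |T_i| \ge sp - 2s + p + 1$. (9.42)

By the definition of $m_\gamma$, we have

$y + 2(s - y) \le \sum_{\gamma \in \mathbb S} m_\gamma = p$. (9.43)

From the last two inequalities, we have $x + y \ge s + 1$ and thus there exists a $\gamma \in \mathbb S$ such that

$n_\gamma = p$ and $m_\gamma = 1$. (9.44)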
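The pigeonhole step leading to (9.44) is purely combinatorial, so for small p and s it can be confirmed by exhaustive enumeration. The sketch below is a brute-force check written under our reading of the definitions of $n_\gamma$ and $m_\gamma$ and of the condition (9.41); the function names are illustrative.

```python
from itertools import combinations, product

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def check_pigeonhole(p, s):
    """Return True if every configuration with sum |T_i| >= s*p - 2*s + 1
    admits some gamma with n_gamma = p and m_gamma = 1."""
    S = range(s)
    for alpha in product(S, repeat=p):
        if len(set(alpha)) != s:      # the alpha_i must generate all of S
            continue
        # T_i ranges over the subsets of S without alpha_i
        choices = [list(subsets(x for x in S if x != a)) for a in alpha]
        for Ts in product(*choices):
            if sum(len(T) for T in Ts) < s * p - 2 * s + 1:
                continue              # condition (9.41) fails, nothing to check
            found = any(
                all(g == alpha[k] or g in Ts[k] for k in range(p))  # n_gamma = p
                and sum(1 for a in alpha if a == g) == 1            # m_gamma = 1
                for g in S)
            if not found:
                return False
    return True

if __name__ == "__main__":
    for p, s in [(2, 2), (3, 2), (3, 3), (4, 3)]:
        print(p, s, check_pigeonhole(p, s))
```

For some pairs (p, s) the condition (9.41) cannot be met at all, in which case the check passes vacuously; the enumeration is only meant as a sanity check of the counting, not as a replacement for the argument.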
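Without loss of generality, we assume that $\gamma = \alpha_1$. Then using (9.44), we know

$\gamma \ne \alpha_k, \quad \gamma \in T_k, \quad$ if $k \ne 1$. (9.45)

Then with (9.45), the decomposition $Z^{(i),\mathbb S,(T)} = \sum_{k,l} \overline{a^i_k}\, \mathcal G^{(i),\mathbb S,(T)}_{kl}\, a^i_l$ (9.1) and the property that $\mathcal G^{(i),\mathbb S,(T)}_{kl}$ is
independent of the rows or columns of H in $\{i\} \cup T$ (9.2), we have that for $k \ne 1$, $Z^{(\alpha_k),\mathbb S_k,(T_k)}$ is independent
of $\mathbf a^\gamma$. By the definition of $\mathbb{IE}$, for k = 1, we also have

$\mathbb E_{\mathbf a^\gamma} \mathbb{IE}_{\alpha_1} Z^{(\alpha_1),\mathbb S_1,(T_1)} = \mathbb E_{\mathbf a^{\alpha_1}} \mathbb{IE}_{\alpha_1} Z^{(\alpha_1),\mathbb S_1,(T_1)} = 0$. (9.46)

Therefore, under the assumption (9.41) we have

$\mathbb E A(T_1, T_2, \dots, T_p, \mathbf V) = \mathbb E\, \big( B^{v_1} \mathbb{IE}_{\alpha_1} Z^{(\alpha_1),\mathbb S_1,(T_1)} \big) \big( B^{v_2} \mathbb{IE}_{\alpha_2} Z^{(\alpha_2),\mathbb S_2,(T_2)} \big) \cdots$

$\qquad = \mathbb E\, \big( \mathbb E_{\mathbf a^\gamma} B^{v_1} \mathbb{IE}_{\alpha_1} Z^{(\alpha_1),\mathbb S_1,(T_1)} \big) \big( B^{v_2} \mathbb{IE}_{\alpha_2} Z^{(\alpha_2),\mathbb S_2,(T_2)} \big) \cdots = 0$.

Combining this identity with (9.40), we obtain (9.36) and thus conclude Lemma 9.5. □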
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